CN114020651B - Interface address based duplicate removal method, device, equipment and readable storage medium - Google Patents

Publication number
CN114020651B
Authority
CN
China
Prior art keywords
interface
interface address
address
denoising
storage area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210007597.7A
Other languages
Chinese (zh)
Other versions
CN114020651A (en)
Inventor
林存练
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Mingyuan Cloud Technology Co Ltd
Original Assignee
Shenzhen Mingyuan Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Mingyuan Cloud Technology Co Ltd
Priority to CN202210007597.7A
Publication of CN114020651A
Application granted
Publication of CN114020651B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/06 Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G06F12/0646 Configuration or reconfiguration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses an interface address-based deduplication method, device, equipment and readable storage medium, relating to the field of API (application programming interface) processing. The method comprises the following steps: when a front-end application requests an interface from a back end, acquiring an interface address of the interface; denoising the interface address by regular-expression matching in an ETL service to generate a denoised interface address, and writing the denoised interface address and the interface address into a preset storage area in correspondence with each other; and acquiring the denoised interface address and the corresponding interface address from the preset storage area, and classifying the interface addresses that correspond to the same denoised interface address as one class of interface, so as to remove duplicate interface addresses. Different interface addresses of the same original interface are thereby unified, which facilitates statistical analysis of interface data and solves the technical problem that statistical analysis is difficult when the same original interface appears under multiple interface addresses.

Description

Interface address based duplicate removal method, device, equipment and readable storage medium
Technical Field
The invention relates to the field of API (application programming interface) processing, and in particular to an interface address-based deduplication method, device, equipment and readable storage medium.
Background
With the rapid development of cloud-native technology, product iteration and delivery cycles have become more frequent and intensive, and the number of API (Application Programming Interface) interfaces delivered grows by the day. In the field of application monitoring, API analysis has become important: the errors, exceptions, speed and loss of an API along the complete link behind it all need to be described and explained clearly and completely.
However, in some scenarios, for example when a PaaS platform provides API service capability externally, a product generates many URLs carrying different IDs, either through drag-and-drop configuration or through different calls. The IDs are generated as combinations of letters and digits, yet the URLs are essentially the same API. This complicates API data statistics and report analysis and can even mask problems, which hinders the statistics and analysis of API data.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The main object of the invention is to provide an interface address-based deduplication method, aiming to solve the technical problem that the interface addresses currently generated for an API hinder subsequent statistical analysis of API data.
To achieve the above object, the invention provides an interface address-based deduplication method, which comprises the following steps:
when a front-end application requests an interface from a back end, acquiring an interface address of the interface;
denoising the interface address by regular-expression matching in an ETL service to generate a denoised interface address, and writing the denoised interface address and the interface address into a preset storage area in correspondence with each other;
and acquiring the denoised interface address and the corresponding interface address from the preset storage area, and classifying the interface addresses that correspond to the same denoised interface address as one class of interface, so as to remove duplicate interface addresses.
Further, the step of acquiring the interface address of the interface comprises:
acquiring the interface address from the back end through a collector;
sending the interface address to a Pulsar message queue through the collector;
and subscribing to the interface address in the Pulsar message queue with the ETL service acting as a consumer.
Further, the step of denoising the interface address by regular-expression matching in the ETL service to generate a denoised interface address comprises:
receiving, by the ETL service, the interface address sent by the Pulsar message queue, and extracting the address URI from the interface address;
and matching the identification text in the address URI with a preset regular expression, and replacing the identification text with preset characters to generate the denoised interface address.
Further, the step of writing the denoised interface address and the interface address into a preset storage area in correspondence with each other comprises:
subscribing, with the preset storage area acting as a consumer of the Pulsar message queue, to the denoised interface address and the interface address corresponding to it.
Further, after the step of denoising the interface address by regular-expression matching in the ETL service to generate a denoised interface address, the method further comprises:
performing an encryption operation on the denoised interface address in the ETL service to generate an interface fingerprint of the interface, and writing the interface fingerprint into the preset storage area as a label of the interface address.
Further, the step of performing an encryption operation on the denoised interface address in the ETL service to generate an interface fingerprint of the interface, and writing the interface fingerprint into the preset storage area as a label of the interface address, comprises:
performing the encryption operation on the denoised interface address in the ETL service based on the MD5 algorithm, and taking the result of the encryption operation as the interface fingerprint of the interface address;
and writing the interface fingerprint into the Pulsar message queue as a label of the interface address, the preset storage area acting as a consumer of the Pulsar message queue to obtain the interface address together with the interface fingerprint that labels it.
Further, the step of acquiring the denoised interface address and the corresponding interface address from the preset storage area, and classifying the interface addresses that correspond to the same denoised interface address as one class of interface, comprises:
acquiring, from the preset storage area, the interface fingerprint generated from the denoised interface address and the interface address corresponding to the denoised interface address;
and classifying the interface addresses that correspond to the same interface fingerprint as one class of interface.
In addition, to achieve the above object, the invention further provides an interface address-based deduplication device, which comprises:
an acquisition module, used for acquiring an interface address of an interface when a front-end application requests the interface from a back end;
a denoising module, used for denoising the interface address by regular-expression matching in an ETL service to generate a denoised interface address, and writing the denoised interface address and the interface address into a preset storage area in correspondence with each other;
and a classification module, used for acquiring the denoised interface address and the corresponding interface address from the preset storage area, and classifying the interface addresses that correspond to the same denoised interface address as one class of interface, so as to remove duplicate interface addresses.
In addition, to achieve the above object, the invention further provides interface address-based deduplication equipment, which comprises a memory, a processor and an interface address-based deduplication program stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the interface address-based deduplication method described above.
In addition, to achieve the above object, the invention further provides a readable storage medium storing an interface address-based deduplication program which, when executed by a processor, implements the steps of the interface address-based deduplication method described above.
The interface address-based deduplication method provided by the embodiments of the invention collects the interface addresses generated by the back end, matches the collected interface addresses against a regular expression in a streaming, real-time computation, and replaces each randomly generated ID with fixed characters. Different interface addresses of the same original interface are thereby unified, which facilitates statistical analysis of interface data, for example accurately counting the number of calls and references of an original interface. This solves the technical problem that statistical analysis is difficult when the same original interface appears under multiple interface addresses.
Drawings
FIG. 1 is a schematic diagram of an apparatus architecture of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method for deduplication based on interface addresses according to a first embodiment of the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of a method for deduplication based on interface addresses according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The main solution of the embodiments of the invention is as follows: collect the interface addresses generated by the back end, match the collected interface addresses against a regular expression in a streaming, real-time computation, and replace each randomly generated ID with fixed characters, so that different interface addresses of the same original interface are unified.
In some scenarios, for example when a PaaS platform provides API service capability externally, the same API produces many URLs carrying different IDs, either through drag-and-drop configuration or through different calls. The IDs are generated as combinations of digits and letters, yet the URLs are essentially the same API. This complicates API data statistics and report analysis and can even mask problems, which hinders the statistics and analysis of API data.
The invention provides a solution that unifies the different interface addresses of the same original interface and so facilitates statistical analysis of interface data, for example accurately counting the number of calls and references of an original interface. This solves the technical problem that statistical analysis is difficult when the same original interface produces multiple interface addresses.
Fig. 1 is a schematic diagram of the device structure of the hardware operating environment according to an embodiment of the invention.
The device of the embodiment of the invention may be a server, or a terminal device with a display function, such as a PC or a portable computer.
As shown in fig. 1, the device may include a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005 and a communication bus 1002. The communication bus 1002 enables communication among these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Optionally, the device may also include a camera, RF (radio frequency) circuitry, sensors, audio circuitry, a Wi-Fi module, and so forth. The sensors include, for example, light sensors, motion sensors and other sensors. Specifically, the light sensor may include an ambient light sensor, which adjusts the brightness of the display screen according to the ambient light, and a proximity sensor, which turns off the display screen and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes) and the magnitude and direction of gravity when the mobile terminal is stationary; it can be used in applications that recognize the attitude of the mobile terminal (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer and tap detection). The mobile terminal may of course also be configured with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer and an infrared sensor, which are not described further here.
Those skilled in the art will appreciate that the device structure shown in fig. 1 does not limit the device, which may include more or fewer components than shown, combine some components, or arrange the components differently.
As shown in fig. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module and an interface address-based deduplication program.
In the terminal shown in fig. 1, the network interface 1004 is mainly used to connect to a back-end server and exchange data with it; the user interface 1003 is mainly used to connect to a client (user side) and exchange data with it; and the processor 1001 may be configured to call the interface address-based deduplication program stored in the memory 1005 and perform the following operations:
when a front-end application requests an interface from a back end, acquiring an interface address of the interface;
denoising the interface address by regular-expression matching in an ETL service to generate a denoised interface address, and writing the denoised interface address and the interface address into a preset storage area in correspondence with each other;
and acquiring the denoised interface address and the corresponding interface address from the preset storage area, and classifying the interface addresses that correspond to the same denoised interface address as one class of interface, so as to remove duplicate interface addresses.
Further, the processor 1001 may call the interface address-based deduplication program stored in the memory 1005 and also perform the following operations:
the step of acquiring the interface address of the interface comprises:
acquiring the interface address from the back end through a collector;
sending the interface address to a Pulsar message queue through the collector;
and subscribing to the interface address in the Pulsar message queue with the ETL service acting as a consumer.
Further, the processor 1001 may call the interface address-based deduplication program stored in the memory 1005 and also perform the following operations:
the step of denoising the interface address by regular-expression matching in the ETL service to generate a denoised interface address comprises:
receiving, by the ETL service, the interface address sent by the Pulsar message queue, and extracting the address URI from the interface address;
and matching the identification text in the address URI with a preset regular expression, and replacing the identification text with preset characters to generate the denoised interface address.
Further, the processor 1001 may call the interface address-based deduplication program stored in the memory 1005 and also perform the following operations:
the step of writing the denoised interface address and the interface address into a preset storage area in correspondence with each other comprises:
subscribing, with the preset storage area acting as a consumer of the Pulsar message queue, to the denoised interface address and the interface address corresponding to it.
Further, the processor 1001 may call the interface address-based deduplication program stored in the memory 1005 and also perform the following operations:
after the step of denoising the interface address by regular-expression matching in the ETL service to generate a denoised interface address, the method further comprises:
performing an encryption operation on the denoised interface address in the ETL service to generate an interface fingerprint of the interface, and writing the interface fingerprint into the preset storage area as a label of the interface address.
Further, the processor 1001 may call the interface address-based deduplication program stored in the memory 1005 and also perform the following operations:
the step of performing an encryption operation on the denoised interface address in the ETL service to generate an interface fingerprint of the interface, and writing the interface fingerprint into the preset storage area as a label of the interface address, comprises:
performing the encryption operation on the denoised interface address in the ETL service based on the MD5 algorithm, and taking the result of the encryption operation as the interface fingerprint of the interface address;
and writing the interface fingerprint into the Pulsar message queue as a label of the interface address, the preset storage area acting as a consumer of the Pulsar message queue to obtain the interface address together with the interface fingerprint that labels it.
Further, the processor 1001 may call the interface address-based deduplication program stored in the memory 1005 and also perform the following operations:
the step of acquiring the denoised interface address and the corresponding interface address from the preset storage area, and classifying the interface addresses that correspond to the same denoised interface address as one class of interface, comprises:
acquiring, from the preset storage area, the interface fingerprint generated from the denoised interface address and the interface address corresponding to the denoised interface address;
and classifying the interface addresses that correspond to the same interface fingerprint as one class of interface.
Referring to fig. 2, a first embodiment of the interface address-based deduplication method of the invention comprises:
Step S10, when a front-end application requests an interface from a back end, acquiring an interface address of the interface;
It can be understood that a front-end application requesting an interface is the application scenario of this embodiment. When the back end receives interface requests, the same API interface is presented under a different URL (Uniform Resource Locator) to each requester. For example, suppose a PaaS (Platform as a Service) platform provides an original API address http://www.mxxxyxxxyxxx.com/test/api/, and several users need to call the API. For user a, the actual address under which the original API interface is provided is interface address 1: http://www.mxxxyxxxyxxx.com/test/api/123456/; for user b, it is interface address 2: http://www.mxxxyxxxyxxx.com/test/api/234566/; for user c, it is interface address 3: http://www.mxxxyxxxyxxx.com/test/api/234354/. In general, the ID (identifier) appended as a suffix to the original API address is a random combination of digits or letters; the random values actually generated are more complicated, and the example is simplified for ease of description. Generation of the random value also depends on the timestamp, so besides differing between users, the real address of the original API provided by the back-end platform also differs each time it is accessed at a different point in time. In essence, however, interface address 1, interface address 2 and interface address 3 are all the same API.
Further, the interface address is acquired from the back end through a collector; the interface address is sent to a Pulsar message queue through the collector; and the ETL service, acting as a consumer, subscribes to the interface address in the Pulsar message queue.
The collector acquires the API interface addresses provided by the back end, such as interface address 1, interface address 2 and interface address 3, through a platform or server probe; it can be understood that far more interface addresses are acquired in actual use. The collector acts as a producer for the Pulsar message queue (a message middleware) and sends messages to it; specifically, the collector sends the collected interface addresses to topic 1 in the Pulsar message queue. The ETL (Extract-Transform-Load) service then acts as a consumer of the Pulsar message queue: for example, it subscribes to topic 1 and obtains from it the interface addresses that the back end provides externally.
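The producer and consumer roles described above can be sketched as follows. This is only an illustration under stated assumptions: an in-memory `queue.Queue` stands in for topic 1 of the Pulsar message queue so that the example runs without a broker, and the function names `collector_publish` and `etl_consume` are hypothetical.

```python
import queue

# Assumption: an in-memory queue stands in for "topic 1" of the Pulsar message queue.
topic_1 = queue.Queue()


def collector_publish(addresses, topic):
    """Collector acting as producer: send each collected interface address to the topic."""
    for address in addresses:
        topic.put(address)


def etl_consume(topic):
    """ETL service acting as consumer: drain the addresses currently held by the topic."""
    received = []
    while True:
        try:
            received.append(topic.get_nowait())
        except queue.Empty:
            break
    return received


collected = [
    "http://www.mxxxyxxxyxxx.com/test/api/123456/",
    "http://www.mxxxyxxxyxxx.com/test/api/234566/",
    "http://www.mxxxyxxxyxxx.com/test/api/234354/",
]
collector_publish(collected, topic_1)
received = etl_consume(topic_1)
```

In a real deployment the collector and the ETL service would be separate processes connected through the message middleware; the decoupling is what lets the ETL step run as a streaming computation.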
Step S20, denoising the interface address by regular-expression matching in the ETL service to generate a denoised interface address, and writing the denoised interface address and the interface address into a preset storage area in correspondence with each other;
It can be understood that the noise to be removed is the ID that the back end appends as a suffix to the externally provided interface address under different conditions, such as 123456 in interface address 1, http://www.mxxxyxxxyxxx.com/test/api/123456/.
Further, the ETL service receives the interface address sent by the Pulsar message queue and extracts the address URI from the interface address; the identification text in the address URI is matched with a preset regular expression and replaced with preset characters to generate the denoised interface address.
Specifically, for example, the ETL service obtains interface addresses by subscribing to topic 1 of the Pulsar message queue and extracts the URI (Uniform Resource Identifier) part of each interface address, for example the URI part of interface address 1, http://www.mxxxyxxxyxxx.com/test/api/123456/. The ETL service then performs regular-expression matching on the URI part of each interface address, using the regular expression \/([a-z\_])+?\d{2,20}. The matching rule targets the random number between two '/' characters in the text, and the matched random number is replaced with a preset character string. Taking interface address 1 as an example, the denoised interface address finally generated is http://www.mxxxyxxxyxxx.com/test/api/. Interface address 2 and interface address 3 are denoised in the same way, which is not repeated here. Because the regular expression is applied only to the URI part, text in the domain name part (www.mxxxyxxxyxxx.com) is not replaced. In addition to being a consumer of the Pulsar message queue, the ETL service also acts as a producer, for example sending interface address 1 and denoised interface address 1 to topic 2 of the Pulsar message queue.
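A minimal sketch of this denoising step is given below. It is not the expression used in the text: the pattern `ID_SEGMENT`, the placeholder `{id}` and the function name `denoise` are assumptions chosen for illustration. The pattern is applied to the path only, so digits in the domain name are left alone.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed pattern: a path segment made only of 2 to 20 digits,
# bounded by "/" on the right or by the end of the path.
ID_SEGMENT = re.compile(r"/\d{2,20}(?=/|$)")

# Assumed preset replacement text for the matched ID.
PLACEHOLDER = "/{id}"


def denoise(interface_address: str) -> str:
    """Replace numeric ID segments in the path, leaving scheme and domain untouched."""
    parts = urlsplit(interface_address)
    clean_path = ID_SEGMENT.sub(PLACEHOLDER, parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, clean_path, parts.query, parts.fragment)
    )


address_1 = "http://www.mxxxyxxxyxxx.com/test/api/123456/"
address_2 = "http://www.mxxxyxxxyxxx.com/test/api/234566/"
denoised_1 = denoise(address_1)
denoised_2 = denoise(address_2)
```

With these assumptions both addresses denoise to `http://www.mxxxyxxxyxxx.com/test/api/{id}/`. IDs that mix letters and digits would need a broader pattern, at the cost of a higher risk of matching ordinary path segments.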
Further, the preset storage area acts as a consumer of the Pulsar message queue and subscribes to the denoised interface address and the interface address corresponding to it.
The preset storage area subscribes to topic 2 as a consumer of the Pulsar message queue and obtains interface address 1 and denoised interface address 1 from topic 2 to complete the storage.
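The storage step can be pictured with the sketch below. The text does not say what the preset storage area is, so an in-memory SQLite table is used purely as an assumption, an in-memory queue again stands in for topic 2, and the table name, column names and the `{id}` placeholder are hypothetical.

```python
import queue
import sqlite3

# Assumption: an in-memory queue stands in for "topic 2" of the Pulsar message queue.
topic_2 = queue.Queue()
topic_2.put((
    "http://www.mxxxyxxxyxxx.com/test/api/123456/",
    "http://www.mxxxyxxxyxxx.com/test/api/{id}/",
))
topic_2.put((
    "http://www.mxxxyxxxyxxx.com/test/api/234566/",
    "http://www.mxxxyxxxyxxx.com/test/api/{id}/",
))

# Assumption: SQLite plays the role of the preset storage area.
store = sqlite3.connect(":memory:")
store.execute(
    "CREATE TABLE interface_address (address TEXT NOT NULL, denoised_address TEXT NOT NULL)"
)


def storage_consume(topic, conn):
    """Storage area acting as consumer: persist each (address, denoised address) pair."""
    stored = 0
    while not topic.empty():
        address, denoised = topic.get()
        conn.execute(
            "INSERT INTO interface_address (address, denoised_address) VALUES (?, ?)",
            (address, denoised),
        )
        stored += 1
    conn.commit()
    return stored


stored = storage_consume(topic_2, store)
```

Keeping the original address next to its denoised form preserves the raw data, so the classification step can be rerun if the denoising rule changes.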
Step S30, acquiring the denoised interface address and the corresponding interface address from the preset storage area, and classifying the interface addresses that correspond to the same denoised interface address as one class of interface, so as to remove duplicate interface addresses.
It can be understood that the preset storage area holds a large number of interface addresses together with their corresponding denoised interface addresses. The interface addresses may or may not belong to the same original interface, but those that do belong to the same original interface have the same denoised interface address. For example, interface address 1, interface address 2 and interface address 3 of the same original interface all yield the denoised interface address http://www.mxxxyxxxyxxx.com/test/api/. Interface addresses with the same denoised interface address are therefore classified into one class, and the data of one class of interface addresses is displayed together, which achieves the effect of removing duplicate interface addresses.
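The classification in step S30 amounts to grouping stored records by their denoised address. A minimal sketch, assuming the records have already been read from the storage area as (address, denoised address) pairs; the `/test/order/` address and the `{id}` placeholder are made up for illustration.

```python
from collections import defaultdict

# Assumed records as read from the preset storage area: (interface address, denoised address).
records = [
    ("http://www.mxxxyxxxyxxx.com/test/api/123456/", "http://www.mxxxyxxxyxxx.com/test/api/{id}/"),
    ("http://www.mxxxyxxxyxxx.com/test/api/234566/", "http://www.mxxxyxxxyxxx.com/test/api/{id}/"),
    ("http://www.mxxxyxxxyxxx.com/test/api/234354/", "http://www.mxxxyxxxyxxx.com/test/api/{id}/"),
    # Hypothetical second interface, added only to show that classes stay separate.
    ("http://www.mxxxyxxxyxxx.com/test/order/777/", "http://www.mxxxyxxxyxxx.com/test/order/{id}/"),
]


def classify(pairs):
    """Group interface addresses that share the same denoised interface address."""
    classes = defaultdict(list)
    for address, denoised in pairs:
        classes[denoised].append(address)
    return dict(classes)


classes = classify(records)
api_class = classes["http://www.mxxxyxxxyxxx.com/test/api/{id}/"]
```

Each key of `classes` then represents one original interface, and statistics such as call counts can be reported per key rather than per raw address.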
In this embodiment, the interface addresses generated by the back end are collected and matched against a regular expression in a streaming, real-time computation, and each randomly generated ID is replaced with fixed characters. Different interface addresses of the same original interface are thereby unified, which facilitates statistical analysis of interface data, for example accurately counting the number of calls and references of an original interface. This solves the technical problem that statistical analysis is difficult when the same original interface appears under multiple interface addresses.
Further, referring to fig. 3, a second embodiment of the interface address-based deduplication method of the invention comprises:
Step S100, when a front-end application requests an interface from a back end, acquiring an interface address of the interface;
Further, the interface address is acquired from the back end through a collector; the interface address is sent to a Pulsar message queue through the collector; and the ETL service, acting as a consumer, subscribes to the interface address in the Pulsar message queue.
Step S210, denoising the interface address by regular-expression matching in the ETL service to generate a denoised interface address;
Further, the ETL service receives the interface address sent by the Pulsar message queue and extracts the address URI from the interface address; the identification text in the address URI is matched with a preset regular expression and replaced with preset characters to generate the denoised interface address.
Step S220, performing an encryption operation on the denoised interface address in the ETL service to generate an interface fingerprint of the interface, and writing the interface fingerprint into a preset storage area as a label of the interface address;
Further, the encryption operation is performed on the denoised interface address in the ETL service based on the MD5 algorithm, and the result of the encryption operation is taken as the interface fingerprint of the interface address; the interface fingerprint is written into the Pulsar message queue as a label of the interface address, and the preset storage area acts as a consumer of the Pulsar message queue to obtain the interface address together with the interface fingerprint that labels it.
Specifically, in addition to denoising the interface address with a regular expression to generate the denoised interface address, the ETL service uses the denoised interface address as the input of an encryption operation with the MD5 (Message-Digest Algorithm 5) algorithm, producing a character string that corresponds to the denoised interface address. This character string is the interface fingerprint of the interface address corresponding to the denoised interface address. Based on the example above, interface address 1, interface address 2 and interface address 3 yield the same denoised interface address after denoising, so the interface fingerprints generated from it with the MD5 algorithm are also the same. Interface addresses generated from the same original interface therefore produce the same interface fingerprint, which is in effect the interface fingerprint of the original interface. Having obtained the interface fingerprint by processing a received interface address, the ETL service sends it to the Pulsar message queue as the label of that interface address.
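The fingerprint step can be sketched with Python's standard `hashlib`. MD5 is a message-digest (hash) function, so the same denoised address always yields the same fingerprint. The function name `interface_fingerprint`, the upper-case hex formatting, the `{id}` placeholder and the `/test/order/` address are assumptions for illustration.

```python
import hashlib


def interface_fingerprint(denoised_address: str) -> str:
    """Return the MD5 digest of a denoised interface address as upper-case hex."""
    digest = hashlib.md5(denoised_address.encode("utf-8"))
    return digest.hexdigest().upper()


# Assumed denoised form shared by interface addresses 1, 2 and 3.
denoised_api = "http://www.mxxxyxxxyxxx.com/test/api/{id}/"
# Hypothetical denoised address of a different original interface.
denoised_order = "http://www.mxxxyxxxyxxx.com/test/order/{id}/"

fingerprint_api = interface_fingerprint(denoised_api)
fingerprint_order = interface_fingerprint(denoised_order)
```

A fixed-length fingerprint is convenient as a label: it is cheap to compare and index regardless of how long the underlying address is.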
Step S300, classifying the interface addresses corresponding to the same interface fingerprint into one class of interface, so as to remove repeated interface addresses.
Further, before the interface addresses are classified, the interface fingerprint generated from the denoising interface address, and the interface address corresponding to that denoising interface address, are obtained from the preset storage area.
It can be understood that the preset storage area likewise stores a large number of interface addresses together with their corresponding interface fingerprints. These interface addresses may come from the same original interface or from different ones; if several interface addresses come from the same original interface, their labels (interface fingerprints) are also the same. In effect, each original interface corresponds to a unique interface fingerprint, so whether interface addresses belong to the same original interface can be determined by checking whether their interface fingerprints are the same. For example, for an original interface that has interface address 1, interface address 2, and interface address 3, the interface fingerprint obtained after the denoising and encryption operations is 972130B75066C825 in all three cases. The interface addresses are therefore classified according to their interface fingerprints: interface addresses with the same interface fingerprint are placed in one class, and the data related to one class of interface addresses is displayed together, which achieves the effect of removing repeated interface addresses.
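The classification by fingerprint can be sketched as a simple grouping. The function name and the records are hypothetical: the addresses are made up, the fingerprint 972130B75066C825 is reused from the example above purely as an opaque label (it is not computed from these addresses), and the second fingerprint is a placeholder.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def classify_by_fingerprint(
    records: Iterable[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Group (interface_address, fingerprint) pairs by fingerprint."""
    classes: Dict[str, List[str]] = defaultdict(list)
    for address, fingerprint in records:
        classes[fingerprint].append(address)
    return dict(classes)


# Hypothetical rows as they might be read from the preset storage area.
records = [
    ("/api/order/1001/detail", "972130B75066C825"),
    ("/api/order/1002/detail", "972130B75066C825"),
    ("/api/order/1003/detail", "972130B75066C825"),
    ("/api/user/list", "0000000000000000"),  # placeholder fingerprint
]

classes = classify_by_fingerprint(records)
print(len(classes))  # 2: four addresses collapse into two classes of interface
```

Each key of `classes` stands for one original interface, so per-interface statistics can be computed over the grouped addresses instead of over every distinct address.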
In this embodiment, the interface addresses generated at the back end are collected and processed in a streaming, real-time manner: each collected interface address is matched against a regular expression, and the randomly generated id is replaced with a fixed character to obtain a denoising interface address. An encryption operation on the denoising interface address then yields an interface fingerprint, so that different interface addresses of the same original interface end up with the same interface fingerprint; that is, one original interface corresponds to one unique interface fingerprint. Classifying the different interface addresses by interface fingerprint facilitates statistical analysis of interface-related data, and solves the technical problem that the same original interface produces multiple interface addresses during statistics, which makes statistical analysis difficult.
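As a rough illustration of the flow summarized above, the sketch below processes addresses one at a time and keeps a running count per fingerprint. It makes several assumptions not stated in the text: the regular expression (numeric or long hex/UUID-like path segments count as random ids), the `*` placeholder, the class and function names, and the example URLs are all hypothetical, and an in-memory counter stands in for the message queue and the preset storage area.

```python
import hashlib
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed rule: a path segment made only of digits, or a long hex/UUID-like id.
ID_SEGMENT = re.compile(r"(?<=/)(?:\d+|[0-9a-fA-F-]{16,})(?=/|$)")
PLACEHOLDER = "*"  # assumed fixed replacement character


def denoise(interface_address: str) -> str:
    """Extract the uri path and replace id-like segments with the placeholder."""
    uri = urlsplit(interface_address).path
    return ID_SEGMENT.sub(PLACEHOLDER, uri)


class StreamingDeduplicator:
    """Counts requests per interface fingerprint as addresses arrive."""

    def __init__(self) -> None:
        self.calls = Counter()   # fingerprint -> number of addresses seen
        self.pattern_of = {}     # fingerprint -> denoising interface address

    def process(self, interface_address: str) -> str:
        denoised = denoise(interface_address)
        fingerprint = hashlib.md5(denoised.encode("utf-8")).hexdigest()
        self.calls[fingerprint] += 1
        self.pattern_of[fingerprint] = denoised
        return fingerprint


dedup = StreamingDeduplicator()
for address in [
    "https://example.com/api/order/1001/detail",
    "https://example.com/api/order/1002/detail?page=2",
    "/api/order/550e8400-e29b-41d4-a716-446655440000/detail",
    "/api/user/list",
]:
    dedup.process(address)

print(len(dedup.calls))  # 2 distinct interfaces among 4 addresses
```

The regular expression is the part most likely to differ in practice, since what counts as a randomly generated id depends on how the back end builds its addresses.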
For content in this embodiment that is the same as or similar to the first embodiment, refer to the above description; it is not repeated here.
In addition, in this embodiment, an interface address based deduplication apparatus is further provided, where the interface address based deduplication apparatus includes:
the acquisition module is used for acquiring an interface address of an interface when a front-end application requests the interface from a back end;
the denoising module is used for carrying out regular matching denoising on the interface address through the etl to generate a denoising interface address, and correspondingly writing the denoising interface address and the interface address into a preset storage area;
and the classification module is used for acquiring the denoising interface address and the corresponding interface address from the preset storage area, and classifying the interface address corresponding to the same denoising interface address as a class of interface so as to remove a repeated interface address.
In addition, in this embodiment, interface address based deduplication equipment is further provided, where the interface address based deduplication equipment includes: a memory, a processor, and an interface address based deduplication program stored on the memory and executable on the processor, wherein the interface address based deduplication program, when executed by the processor, implements the steps of the interface address based deduplication method described above.
In addition, a readable storage medium is provided in the present embodiment, and the readable storage medium stores an interface address based deduplication program, and the interface address based deduplication program implements the steps of the interface address based deduplication method when executed by a processor.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention; any equivalent structure or equivalent process derived from the present invention, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of the present invention.

Claims (6)

1. An interface address-based deduplication method, wherein the interface address-based deduplication method comprises the following steps:
when a front-end application requests an interface from a back-end, acquiring an interface address of the interface;
performing regular matching denoising on the interface address through the etl to generate a denoising interface address, and writing the denoising interface address and the interface address into a preset storage area correspondingly;
acquiring the denoising interface address and the corresponding interface address from the preset storage area, and classifying the interface addresses corresponding to the same denoising interface address as a class of interfaces to remove repeated interface addresses;
wherein the step of obtaining the interface address of the interface comprises:
obtaining the interface address from the back end through an aggregator;
sending the interface address to a pulsar message queue through the aggregator;
subscribing the etl as a consumer to the interface address in the pulsar message queue;
the step of generating a denoised interface address by performing regular matching denoising on the interface address through the etl comprises:
receiving the interface address sent by the pulsar message queue through the etl, and extracting an address uri from the interface address;
matching the identification text in the address uri by using a preset regular expression, and replacing the identification text with preset characters to generate a denoising interface address;
after the step of generating a denoised interface address by performing regular matching denoising on the interface address through the etl, the method further includes:
carrying out encryption operation on the denoising interface address through the etl to generate an interface fingerprint of the interface, and writing the interface fingerprint into a preset storage area as a label of the interface address;
the step of obtaining the denoising interface address and the corresponding interface address from the preset storage area, and classifying the interface addresses corresponding to the same denoising interface address as a class of interface includes:
acquiring the interface fingerprint generated according to the denoising interface address and the interface address corresponding to the denoising interface address from the preset storage area;
and classifying the interface addresses corresponding to the same interface fingerprints into a class of interfaces.
2. The interface address-based deduplication method of claim 1, wherein the writing the denoising interface address and the interface address into a preset storage area correspondingly comprises:
and taking the preset storage area as a consumer of the pulsar message queue to subscribe the denoising interface address and the interface address corresponding to the denoising interface address.
3. The interface address-based deduplication method of claim 1, wherein the encrypting the denoising interface address by the etl generates an interface fingerprint of the interface, and the writing the interface fingerprint into a preset storage area as a tag of the interface address comprises:
performing encryption operation on the denoising interface address through the etl based on an MD5 algorithm, and taking the result of the encryption operation as the interface fingerprint of the interface address;
and writing the interface fingerprint into the pulsar message queue as a label of the interface address, and taking the preset storage area as a consumer of the pulsar message queue to obtain the interface address and the interface fingerprint as the label of the interface address.
4. An interface address based deduplication apparatus, wherein the interface address based deduplication apparatus comprises:
the acquisition module is used for acquiring an interface address of an interface when a front-end application requests the interface from a back end;
the denoising module is used for carrying out regular matching denoising on the interface address through the etl to generate a denoising interface address, and correspondingly writing the denoising interface address and the interface address into a preset storage area;
the classification module is used for acquiring the denoising interface address and the corresponding interface address from the preset storage area, and classifying the interface address corresponding to the same denoising interface address as a class of interface to remove a repeated interface address;
wherein the acquisition module is further configured to:
obtaining the interface address from the back end through an aggregator;
sending the interface address to a pulsar message queue through the aggregator;
subscribing the etl as a consumer to the interface address in the pulsar message queue;
wherein the denoising module is further configured to:
receiving the interface address sent by the pulsar message queue through the etl, and extracting an address uri from the interface address;
matching the identification text in the address uri by using a preset regular expression, and replacing the identification text with preset characters to generate a denoising interface address;
wherein the denoising module is further configured to:
carrying out encryption operation on the denoising interface address through the etl to generate an interface fingerprint of the interface, and writing the interface fingerprint into a preset storage area as a label of the interface address;
wherein the classification module is further configured to:
acquiring the interface fingerprint generated according to the denoising interface address and the interface address corresponding to the denoising interface address from the preset storage area;
and classifying the interface addresses corresponding to the same interface fingerprints into a class of interfaces.
5. An interface address based deduplication apparatus, wherein the interface address based deduplication apparatus comprises: a memory, a processor, and an interface address based deduplication program stored on the memory and executable on the processor, the interface address based deduplication program, when executed by the processor, implementing the steps of the interface address based deduplication method of any of claims 1-3.
6. A readable storage medium, having stored thereon an interface address based deduplication program, which when executed by a processor implements the steps of the interface address based deduplication method of any one of claims 1 through 3.
CN202210007597.7A 2022-01-06 2022-01-06 Interface address based duplicate removal method, device, equipment and readable storage medium Active CN114020651B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210007597.7A CN114020651B (en) 2022-01-06 2022-01-06 Interface address based duplicate removal method, device, equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN114020651A CN114020651A (en) 2022-02-08
CN114020651B true CN114020651B (en) 2022-05-27

Family

ID=80069808

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210007597.7A Active CN114020651B (en) 2022-01-06 2022-01-06 Interface address based duplicate removal method, device, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN114020651B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103984753A (en) * 2014-05-28 2014-08-13 北京京东尚科信息技术有限公司 Method and device for extracting web crawler reduplication-removing characteristic value
EP3086618A1 (en) * 2015-04-23 2016-10-26 Thomson Licensing Repeating method and corresponding communication network device, system, computer readable program product and computer readable storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130003600A1 (en) * 2011-06-29 2013-01-03 International Business Machines Corporation Configuration of Interfaces Communicatively Coupled to Link-Local Zones in a Network
CN106844389B (en) * 2015-12-07 2021-05-04 阿里巴巴集团控股有限公司 Method and device for processing URL (Uniform resource locator)
US10764239B2 (en) * 2018-11-28 2020-09-01 Vmware, Inc. Link local address assignment for interfaces of overlay distributed router
CN110995672B (en) * 2019-11-20 2023-09-01 天津大学 Network security authentication method for software development
CN113612306A (en) * 2020-05-18 2021-11-05 海南美亚电能有限公司 Distributed power distribution cabinet and control system thereof
CN111601314B (en) * 2020-05-27 2023-04-28 北京亚鸿世纪科技发展有限公司 Method and device for double judging bad short message by pre-training model and short message address
CN112287684B (en) * 2020-10-30 2024-06-11 中国科学院自动化研究所 Short text auditing method and device for fusion variant word recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant