WO2023071729A1 - Method and apparatus for managing recommendation strategies - Google Patents

Method and apparatus for managing recommendation strategies

Info

Publication number
WO2023071729A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
recommendation
des
target application
ttp
Prior art date
Application number
PCT/CN2022/123907
Other languages
English (en)
French (fr)
Inventor
梁宇明
金敬亭
胡星海
潘鲁宁
叶剑烨
汪天一
陈兴修
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Publication of WO2023071729A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/629 Protecting access to data via a platform, e.g. using keys or access control rules to features or functions of an application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/541 Interprogram communication via adapters, e.g. between incompatible applications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general

Definitions

  • Various implementations of the present disclosure relate to the computer field, and more specifically, to a method, apparatus, device, and computer storage medium for managing recommendation strategies.
  • Such a global application may need to provide services to users in multiple different regions based on the same technical architecture.
  • these areas may have completely different data security constraints, such as specific data sovereignty protection requirements, which further increases the difficulty of data security protection.
  • a method of managing recommendation strategies includes: acquiring a set of object features associated with a set of objects in the target application, where the set of object features is converted based on the attributes of the set of objects and does not directly express those attributes; determining a first object feature and a second object feature from the set of object features, the first difference between the first object feature and the second object feature being less than a first threshold; determining, based on the recommendation strategy in the target application, a first recommendation result corresponding to the first object feature and a second recommendation result corresponding to the second object feature; and evaluating the recommendation strategy based on the first recommendation result and the second recommendation result.
  • an apparatus for managing recommendation strategies includes: an acquisition module configured to acquire a set of object features associated with a set of objects in the target application, where the set of object features is converted based on the attributes of the set of objects and does not directly express those attributes; a selection module configured to determine a first object feature and a second object feature from the set of object features, the first difference between the first object feature and the second object feature being less than a first threshold; a determining module configured to determine, based on the recommendation strategy in the target application, a first recommendation result corresponding to the first object feature and a second recommendation result corresponding to the second object feature; and an evaluation module configured to evaluate the recommendation strategy based on the first recommendation result and the second recommendation result.
  • an electronic device including: a memory and a processor; wherein the memory is used to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement the method of the first aspect.
  • a computer-readable storage medium on which one or more computer instructions are stored, wherein the one or more computer instructions are executed by a processor to implement the method according to the first aspect of the present disclosure.
  • a computer program product comprising one or more computer instructions, wherein the one or more computer instructions are executed by a processor to implement the method according to the first aspect of the present disclosure.
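  • The method summarized above (acquire transformed object features, select pairs whose difference is below a first threshold, obtain both recommendation results, then evaluate) can be illustrated with a minimal sketch. The Euclidean distance, the Jaccard-based result difference, the `second_threshold` parameter, and all function names are assumptions made for illustration; the source does not specify the metrics or the evaluation criterion.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Feature = Sequence[float]


def feature_distance(a: Feature, b: Feature) -> float:
    """Euclidean distance between two feature vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def result_difference(r1: Sequence[str], r2: Sequence[str]) -> float:
    """1 - Jaccard similarity of two recommendation lists (assumed metric)."""
    s1, s2 = set(r1), set(r2)
    if not s1 and not s2:
        return 0.0
    return 1.0 - len(s1 & s2) / len(s1 | s2)


def evaluate_strategy(
    features: Sequence[Feature],
    recommend: Callable[[Feature], Sequence[str]],
    first_threshold: float,
    second_threshold: float,  # hypothetical: not named in the source
) -> List[Tuple[int, int, float]]:
    """Return index pairs of similar features whose recommendations diverge."""
    flagged = []
    for (i, fa), (j, fb) in combinations(enumerate(features), 2):
        # Only compare objects whose features are close to each other.
        if feature_distance(fa, fb) < first_threshold:
            diff = result_difference(recommend(fa), recommend(fb))
            if diff > second_threshold:
                flagged.append((i, j, diff))
    return flagged
```

  • Under these assumptions, an empty return value means that no pair of similar objects received noticeably different recommendations, which is one possible reading of a fair strategy.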
  • FIG. 1 shows a schematic block diagram of a data security protection system according to an embodiment of the present disclosure
  • Figure 2 shows a schematic block diagram of a computing security subsystem according to some embodiments of the present disclosure
  • FIG. 3A illustrates an example deployment environment in which a data exchange subsystem is deployed according to some embodiments of the present disclosure
  • Figure 3B shows the implementation of DES in the TTP's internal data center (IDC) and the non-TTP's overseas internal data center (RoW IDC) according to some embodiments of the present disclosure
  • Figure 3C shows a block diagram of an example architecture of a DES according to some embodiments of the present disclosure
  • Figure 3D shows a flow chart of a data exchange process according to some embodiments of the present disclosure
  • FIG. 3E shows a flowchart of an example data flow of various types of data processing implemented at the DES according to some embodiments of the present disclosure
  • FIG. 3F shows a schematic block diagram of a data exchange architecture involving an MQ channel according to some embodiments of the present disclosure
  • Figure 3G shows a schematic block diagram of a data exchange architecture involving HDFS channels according to some embodiments of the present disclosure
  • FIG. 3H shows a schematic diagram of a target object store (TOS) channel where data is copied from a TTP IDC to an overseas IDC according to some embodiments of the present disclosure
  • Figure 3I shows a schematic diagram of a TOS channel where data is copied from an overseas IDC to a TTP IDC according to some embodiments of the present disclosure
  • Figure 3J shows a message sequence diagram in a TOS channel according to some embodiments of the present disclosure
  • FIG. 3K shows a schematic block diagram of a data exchange architecture involving a service invocation channel according to some embodiments of the present disclosure
  • FIG. 3L shows an example of data exchange from non-TTP to TTP in a service invocation channel according to some embodiments of the present disclosure
  • FIG. 3M shows an example of data exchange from TTP to non-TTP in a service invocation channel according to some embodiments of the present disclosure
  • FIG. 4A shows a flowchart of a method for managing network traffic of a mobile terminal application according to some embodiments of the present disclosure
  • FIG. 4B shows a schematic diagram of an analysis and restriction process for native-type network traffic according to some embodiments of the present disclosure
  • FIG. 4C shows a schematic diagram of a process of analyzing and limiting network traffic of a webpage view type according to some embodiments of the present disclosure
  • FIG. 4D shows a schematic diagram of a process of analyzing and restricting network traffic of a third-party SDK type according to some embodiments of the present disclosure
  • Figure 4E shows a block diagram of a security sandbox system according to some embodiments of the present disclosure
  • FIG. 5 shows a flowchart of an example process of managing recommendation policies according to some embodiments of the present disclosure
  • FIG. 6 shows an example block diagram of an apparatus for managing recommendation strategies according to some embodiments of the present disclosure.
  • FIG. 7 shows a block diagram of an example device that may be used to implement embodiments of the present disclosure.
  • a data security protection system is provided.
  • Fig. 1 shows a schematic block diagram of a data security protection system 1000 according to an embodiment of the present disclosure.
  • the data security protection system 1000 includes multiple subsystems, which are used to protect the security of relevant data generated by a user in the process of using a target application from different dimensions.
  • in order to support the running of the target application, on the one hand, the user needs to be able to run the target application 1080, for example, through a suitable electronic device.
  • on the other hand, the target application platform 1030 needs to be deployed in an appropriate computing environment (e.g., a cloud computing environment), for example, to run various types of services that support the normal operation of the target application 1080.
  • the data security protection system 1000 may first ensure the security of the data generated during the running of the target application 1080 from the perspective of running code security. As shown in FIG. 1, the data security protection system 1000 may include a secure computing subsystem 1060, which may be used to ensure the security of the code corresponding to the target application 1080 and the security of the code corresponding to the target application platform 1030.
  • the service running file compiled by the secure computing subsystem 1060 can be deployed to the target application platform 1030, and the installation file (for example, an apk file) of the target application compiled by the secure computing subsystem 1060 can be published, for example, to the application store 1120.
  • the specific implementation of the secure computing subsystem 1060 will be discussed in detail below in conjunction with FIG. 2 .
  • secure computing subsystem 1060 may be based on cloud infrastructure 1070 .
  • cloud infrastructure 1070 may be provided, for example, by a trusted partner.
  • a "trusted partner" may also be referred to as a Trusted Technology Partner (TTP), which may include, for example, any individual, enterprise, or organization that is technically trusted in a specific area (for example, a specific country or jurisdiction).
  • the data security protection system 1000 may include a trusted security environment 1010 provided by TTP.
  • the target application platform 1030 can be deployed in the trusted security environment 1010 to improve the security of data generated by the target application platform 1030, as well as the transparency and credibility of its operating mechanism.
  • the target application 1080 may provide content recommendation services for users through recommendation algorithms.
  • content recommendations may include, but are not limited to, for example: multimedia content recommendations, user recommendations, commodity recommendations, and the like.
  • the data security protection system 1000 may also include a recommendation management subsystem 1050 , which may, for example, test the recommendation algorithm run by the target application platform 1030 to ensure the fairness of the recommendation mechanism in the target application 1080 .
  • the specific implementation of the recommendation management subsystem 1050 will be described in detail below.
  • the target application platform 1030 may need to communicate with applications or data centers outside the target area (for example, a specific country or jurisdiction) where it is currently deployed (also referred to as overseas applications or overseas data centers).
  • the target area usually restricts, through laws or regulations, the communication of data generated in the area with parties outside it. Certain types of data generated within the target area may be prohibited from being transferred abroad.
  • the data security protection system 1000 may include a data exchange subsystem 1040.
  • the data exchange subsystem 1040 can be deployed in the trusted security environment 1010 to ensure the transparency and credibility of its operation.
  • the data exchange subsystem 1040 may include multiple data channels for different types of data transmission.
  • the multimedia data generated in the target application platform 1030 can communicate with the overseas application 1140 and/or the overseas data center 1150 via the content distribution network 1130 provided by a third party through the corresponding data channel in the data exchange subsystem 1040 .
  • the target application platform 1030 can communicate with the overseas data center 1150 and the overseas development department 1160 through corresponding data channels, such as through direct optical cables.
  • the specific implementation of the data exchange subsystem 1040 will be described in detail below in conjunction with FIG. 3A to FIG. 3M .
  • the data security protection system 1000 may further include an application firewall subsystem 1020.
  • the application firewall subsystem 1020 may be deployed in the trusted security environment 1010 and may be used, for example, to monitor data communication from the target application 1080 to the target application platform 1030, data communication from the target application platform 1030 to the target application 1080, and/or data communication from the target application platform 1030 to the third-party application 1110, etc.
  • in this way, the data security protection system 1000 can ensure the security and compliance of data communication between the target application platform 1030 and overseas parties through the data exchange subsystem 1040, and can also ensure, through the application firewall subsystem 1020, the security and compliance of communication between the target application platform 1030 and various objects within the territory (e.g., the target application 1080 or the third-party application 1110).
  • the data security protection system 1000 may also include, for example, a security sandbox system 1090 managed by TTP, so that the different types of network communications involved in the application business logic 1100 of the target application 1080 can be protected by the security sandbox system 1090. In this manner, the data security protection system 1000 can prevent the target application 1080 from initiating non-compliant data communication, for example, through a backdoor program or the like.
  • the detailed implementation of the security sandbox system 1090 will be described in detail below in conjunction with FIG. 4A to FIG. 4E .
  • TTP can manage and monitor various aspects, such as code security and data security, throughout the entire life cycle of the target application from development to operation, so as to ensure the security of the data associated with the target application and the compliance of its operation.
  • FIG. 2 shows a schematic block diagram of a secure computing subsystem 1060 according to an embodiment of the disclosure.
  • the secure computing subsystem 1060 may include, for example, a secure code environment 2010, which may be provided, for example, by TTP. The working process of the secure computing subsystem 1060 will be described below in conjunction with submitting new development code 2140 .
  • when a developer has new development code 2140 to be deployed, the developer may submit the development code 2140 to the secure code environment 2010 through, for example, a synchronization gateway 2150 provided by TTP. Accordingly, the development code 2140 will be synchronized to the code repository 2160 in the secure code environment 2010.
  • the developer when the developer needs to use the new development code 2140 to compile, the developer can send a build request to the product build system 2080 through the synchronization gateway 2150 , for example.
  • the code repository 2160 can also automatically send a code merge event to the product construction system 2080, so as to trigger the product construction system 2080 to start the build process of the product (artifact, such as executable code).
  • the code pulling module 2090 can obtain the code files used for building from the code repository 2160 .
  • the code file used for building may be specified by the developer, or automatically determined by the artifact building system 2080 .
  • the compiling module 2100 may compile the code pulled by the code pulling module 2090 from the code library 2160, for example, to compile it into an intermediate code.
  • the secure computing subsystem 1060 also needs to ensure the security of the imported third-party codes.
  • the secure computing subsystem 1060 may include a third-party independent gateway 2030 for checking and confirming the security of the imported third-party library 2020 .
  • a third-party library may also be, for example, a compiled link library or the source code itself.
  • Third-party libraries 2020 that pass security checks may be added to artifact libraries 2040 .
  • the compiling module 2100 can also obtain, from the product library 2040, other products on which compilation of the current product depends, such as products previously compiled and generated, or products generated based on the third-party library 2020.
  • the compiling module 2100 can compile the code pulled from the code library 2160 and the dependent products obtained from the product library 2040 to generate intermediate code, so that the security code scanning module 2110 can perform code security detection.
  • the security code scanning module 2110 managed by the TTP can perform any appropriate code scanning process to carry out security checks. The scanning rules are unknown to the developer, thereby ensuring the security of the code used to compile the final product.
  • the uploading module 2120 may perform corresponding uploading according to the result of the security code scanning module 2110 . If the security code scanning module 2110 determines that the compiled intermediate code is safe, the upload module 2120 may upload the further compiled executable file to the product library 2040 .
  • the upload module 2120 may also upload the signature information of the executable file to the product signature management module 2060 .
  • the upload module 2120 can upload the relevant risk to the problem tracking system 2070, for example, to form a risk analysis report.
  • the compiled executable file will be prohibited from being uploaded to the product repository 2040 .
  • the developed code 2140 in the code repository 2160 may also be provided, eg, in a trusted environment, eg, for human review. If it is determined that the development code 2140 is at risk, the result may also be reported to the issue tracking system 2070 .
  • the upload module 2120 may also notify the callback module 2130 to mark the corresponding code as a risk code in the code library 2160 .
  • the issue tracking system 2070 maintained by TTP may, for example, send the received risk reporting information to the developer or maintainer of the development code 2140, as a reminder that the current development code 2140 cannot pass the security check and therefore cannot be deployed.
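  • The branching described above (upload the executable when the scan is clean; otherwise block the upload, report the risk, and mark the code as risky) can be sketched as follows. This is an illustrative model only: the in-memory containers merely stand in for the product library 2040, the problem tracking system 2070, and the risk marking performed via the callback module 2130, and every name here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SecureBuildState:
    """In-memory stand-ins for components of the secure code environment."""
    artifact_repo: Dict[str, bytes] = field(default_factory=dict)  # cf. 2040
    issues: List[str] = field(default_factory=list)                # cf. 2070
    risky_commits: List[str] = field(default_factory=list)         # cf. 2130


def handle_scan_result(
    state: SecureBuildState,
    commit_id: str,
    executable: bytes,
    findings: List[str],
) -> bool:
    """Upload the executable only if the security scan reported no findings."""
    if findings:
        # Risk found: report each finding, mark the commit, and block upload.
        state.issues.extend(f"{commit_id}: {finding}" for finding in findings)
        state.risky_commits.append(commit_id)
        return False
    state.artifact_repo[commit_id] = executable
    return True
```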
  • the development code 2140 may be compiled into an executable file and further added to the artifact repository 2040 to be deployed, eg, via the deployment gateway 2050 .
  • the deployment gateway 2050 can verify whether the artifact's signature is valid through the artifact signature management system 2060 . After the validity of the signature of the artifact is confirmed, the deployment gateway 2050 can deploy the artifact generated based on the development code 2140 to the network.
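  • A minimal sketch of the signature check performed before deployment is given below. The source does not state which signature scheme is used; HMAC-SHA256 with a shared key is assumed here purely for illustration, and `sign_artifact` and `deploy_if_valid` are hypothetical names.

```python
import hashlib
import hmac
from typing import Callable


def sign_artifact(key: bytes, artifact: bytes) -> str:
    """Produce a hex signature for an artifact (assumed scheme: HMAC-SHA256)."""
    return hmac.new(key, artifact, hashlib.sha256).hexdigest()


def deploy_if_valid(
    key: bytes,
    artifact: bytes,
    signature: str,
    deploy: Callable[[bytes], None],
) -> bool:
    """Deploy the artifact only when its recorded signature is valid."""
    expected = sign_artifact(key, artifact)
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, signature):
        return False
    deploy(artifact)
    return True
```

  • In this sketch, an artifact that was modified after signing fails the comparison and is never passed to `deploy`.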
  • the product may be, for example, an application program executed on the client device, and the deployment gateway 2050 may, for example, publish the generated installation file (eg, apk file) to a corresponding application store for users to download. Therefore, the embodiments of the present disclosure can ensure that the installation files that users can download and install are always released by the secure code environment 2010 via the deployment gateway 2050 .
  • the artifact may be, for example, a service program for deployment into the target application platform 1030 .
  • the maintainer of the target application may initiate a request to the deployment platform to deploy a specific artifact to the target application platform 1030 .
  • the target application platform 1030 can acquire the specific artifact to be deployed from the artifact repository 2040 and authenticate the signature of the specific artifact.
  • the product can be deployed to the target application platform 1030 in the form of a virtual machine or a container, for example.
  • the embodiments of the present disclosure can effectively monitor the process by which an application program or service program goes from code to actual deployment and use, across various stages such as code writing, code upload, code compilation, and third-party library reference. Based on this approach, the embodiments of the present disclosure can effectively avoid various security loopholes or compliance risks introduced in the source code.
  • the operation of the application will involve data exchange between application platforms under the jurisdiction of different countries and regions.
  • the data exchange subsystem (DES) 1040 can support the synchronization of the public data of the target application and other data satisfying the rules between different platforms, and ensure the security and compliance of the exchanged data.
  • DES 1040 is configured to detect whether data between different platforms satisfies data exchange constraints.
  • Data exchange constraints may include constraints set in order to meet national or regional laws and regulations, constraints that need to be set due to the requirements of other aspects of enterprise, organization, and/or user protection, and so on.
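  • One way to picture the constraint detection performed by the DES is as a list of named predicates applied to each record before it is exchanged. The sketch below is an assumption-laden illustration: the `Constraint` type, the record layout, and both example rules are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Sequence

Record = Dict[str, Any]


@dataclass(frozen=True)
class Constraint:
    """A named data exchange rule; is_satisfied returns True when a record complies."""
    name: str
    is_satisfied: Callable[[Record], bool]


def check_exchange(record: Record, constraints: Sequence[Constraint]) -> List[str]:
    """Return the names of all constraints that the record violates."""
    return [c.name for c in constraints if not c.is_satisfied(record)]


# Hypothetical example rules, for illustration only.
EXAMPLE_CONSTRAINTS = [
    Constraint("no_personal_fields", lambda r: {"phone", "email"}.isdisjoint(r)),
    Constraint("public_only", lambda r: r.get("visibility") == "public"),
]
```

  • A record would be allowed to cross platforms only when `check_exchange` returns an empty list; returning the violated rule names rather than a boolean keeps the decision auditable.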
  • the TTP may conduct inspections involving data sovereignty protection. Therefore, the security and compliance of data exchange need to be protected in many situations involving cross-platform data exchange. In particular, after the TTP data center is set up, data exchange between the outside world and the TTP data center will be restricted, and data exchanged with the TTP party is expected to be checked for data sovereignty protection.
  • data exchange constraints may include rules related to data sovereignty protection requirements of a particular country or region.
  • This type of interaction data can be divided into two aspects.
  • One aspect includes interoperability data between platforms, and the other includes operation and maintenance data such as access or operation of the platform by platform operation and maintenance personnel.
  • Interoperable data is mainly used to synchronize between the two platforms to ensure the functional integrity of the application. This type of data needs to go through the DES system for security and compliance checks.
  • Interoperability data includes, for example, online business data, offline data, and the like. The inspection of operation and maintenance data is to ensure that the operations of operation and maintenance personnel on the operation and maintenance control plane are also compliant.
  • FIG. 3A illustrates an example deployment environment 3001 in which DES 1040 is deployed, according to some embodiments of the present disclosure.
  • TTP party 3027 refers to an environment that needs to be supervised and restricted by TTP in a specific country or region.
  • the TTP party 3027 may involve various components for running, managing, and maintaining the target application, including, for example, a business system 3028 , an operation platform 3029 , an online storage 3030 , an offline storage 3031 , and the like.
  • the TTP party 3027 also includes an operation and maintenance platform 3032, and operation and maintenance personnel will need to access the operation and maintenance platform 3032 to realize access, management or maintenance of target applications.
  • a non-TTP party 3020 refers to an environment belonging to one or more other countries or regions outside a specific country or region, which is not subject to the data exchange constraints of the country or region where the TTP party 3027 is located.
  • the non-TTP party 3020 may involve various components for running, managing, and maintaining the target application, including, for example, a business system 3020 , an operating platform 3021 , an online storage 3022 , and an offline storage 3023 .
  • the non-TTP party 3020 also includes an operation and maintenance platform 3024, and operation and maintenance personnel will need to access the operation and maintenance platform 3024 to implement access, management or maintenance of local applications or application platforms.
  • domestic user traffic will flow through some components of the TTP party 3027, and overseas user traffic will flow through some components of the non-TTP party 3020.
  • "domestic user traffic" refers to user traffic generated on the application platform under the jurisdiction of the specific country or region, and "overseas user traffic" refers to user traffic generated on application platforms under the jurisdiction of one or more other countries or regions outside the specific country or region.
  • interworking data includes domestic user traffic and overseas user traffic exchanged between the TTP party and the non-TTP party. Interoperable data will go through DES 1040 for data security and compliance checks.
  • an operation gateway 3026 can also be set to perform data security and compliance checks on the operation and maintenance data.
  • FIG. 3A schematically shows some channels, including target object store (TOS) channel, message queue (MQ) channel, offline aggregation data channel, log (LOG) channel, service call channel and so on.
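  • The idea of one DES exposing several channels can be sketched as a small registry that dispatches a payload to the handler of the selected channel. The channel names follow the list above; the `DataExchange` class, its methods, and the boolean return convention are hypothetical and only illustrate the structure.

```python
from enum import Enum
from typing import Callable, Dict


class Channel(Enum):
    """Channels named in FIG. 3A."""
    TOS = "target object store"
    MQ = "message queue"
    OFFLINE = "offline aggregation data"
    LOG = "log"
    SERVICE_CALL = "service call"


class DataExchange:
    """Dispatches payloads to per-channel handlers (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: Dict[Channel, Callable[[bytes], bool]] = {}

    def register(self, channel: Channel, handler: Callable[[bytes], bool]) -> None:
        self._handlers[channel] = handler

    def send(self, channel: Channel, payload: bytes) -> bool:
        """Return the handler's verdict: True if the payload was accepted."""
        handler = self._handlers.get(channel)
        if handler is None:
            raise KeyError(f"no handler registered for channel {channel.name}")
        return handler(payload)
```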
  • Both parties of data exchange may have their own DES to realize data protection, for example, to protect incoming data and/or outgoing data.
  • FIG. 3B further shows the implementation of DES 1040 in the TTP's internal data center (IDC) and the non-TTP's overseas internal data center (RoW IDC).
  • TTP IDC 3056 refers to the IDC for the target application running in a specific country or region, which is subject to the data protection detection of TTP, and overseas IDC 3059 refers to an IDC for the target application running in one or more other countries or regions outside the specific country or region, which may be subject to the data protection restrictions of those other countries or regions.
  • DES 1040A is implemented in TTP IDC 3056 to detect externally incoming and/or internally outgoing data.
  • DES 1040B is implemented in the overseas IDC 3059 to detect externally incoming and/or internally outgoing data.
  • DES 1040A and DES 1040B may be considered as specific deployment instances of DES 1040.
  • the data flowing in from the outside or the data flowing out from the inside can include various types of data, and examples will be described below.
  • external inflow data may include user requests, for example, an active request initiated by a user from a specific country or region through a target application 3058 running in the country.
  • the user request can also be secured through the mobile sandbox and/or the firewall gateway 3057 in the TTP IDC 3056, etc.
  • the user request will reach the domestic application platform 3041 in the TTP IDC 3056 for further processing.
  • the domestic application platform 3041 may include components such as various services, provider gateways, storage, and the like.
  • the user request will be passed to DES 1040A for data protection.
  • the external incoming data may also include a supplier request initiated by the supplier 3055, for example, a request for a specific service of the domestic application platform.
  • a third-party provider may call an application programming interface (API) of the domestic application platform, such as an OpenAPI. Since it cannot be confirmed in advance whether the third-party supplier is a domestic user, the supplier request is sent via the third-party gateway 3040 in the TTP IDC 3056 to the supplier gateway in the domestic application platform 3041, which checks whether the supplier is a domestic user. If the supplier that initiates the request is a domestic user, the supplier request can be responded to normally. If the supplier that initiates the request is an overseas user, the supplier request is transmitted through DES 1040A.
  • the external inflow data may also include data synchronized from the overseas IDC 3059 to the TTP IDC 3056.
  • the overseas inflow data also needs to be processed by DES 1040A.
  • the external inflow data may also include the operation and maintenance operations of the TTP IDC 3056 by the operation and maintenance personnel, such as changes to the TTP IDC 3056.
  • Such operations can include code changes, configuration changes, log maintenance, etc.
  • Code changes may include, for example, the launch of new functions, the release of bin files, and the like.
  • Code changes can be performed by domestic operation and maintenance personnel of the application platform in the country or region.
  • Configuration changes may include enabling or disabling some settings of the target application, scheduled traffic configuration, and the like.
  • configuration changes can be performed by overseas platform operation and maintenance personnel. Of course, this just depends on the management requirements of different applications.
  • Log maintenance refers to maintaining log 3044 in TTP IDC 3056.
  • domestic operation and maintenance personnel or overseas operation and maintenance personnel can perform operation and maintenance operations on the TTP IDC 3056 under the condition of network isolation, so as to further ensure the protection of data sovereignty.
  • domestic operation and maintenance personnel initiate operation and maintenance operations under network isolation, and the operations are distributed through the load balancer 3045 to the code 3042, the operation and maintenance platform 3043, or the log 3044 in the TTP IDC 3056.
  • the operation and maintenance operations of overseas operation and maintenance personnel will go through the operation gateway 3046 for further security checks, and then be distributed to the code 3042 in the TTP IDC 3056, the operation and maintenance platform 3043 or the log 3044.
  • the internal outflow data may include a third-party request initiated from the domestic application platform 3041 during the operation of the application platform to request a third-party service 3054, for example, a third-party service on the public network. Third-party requests also require data protection by DES 1040A.
  • the internal outgoing data may also include data synchronized from the TTP IDC 3056 to the overseas IDC 3059.
  • For example, when the target application is running, it may be necessary to synchronize user content stored in the TTP IDC 3056 to the overseas IDC 3059. Under some regulations on data sovereignty protection, this type of data may be the key data that DES 1040A needs to review.
  • the internal outgoing data may also include code synchronization data.
  • code review of the target application or application platform may be required due to inspection requirements such as data sovereignty protection.
  • the code may be synchronized to the security isolation environment 3051 for review.
  • the security isolation environment 3051 may be, for example, a computer room not connected to the Internet, a physical environment such as a monitored computer room, or a virtual computing environment with security protection, and so on.
  • in the overseas IDC 3059, the DES 1040B deployed there also provides security protection for similar external inflow data and internal outflow data.
  • the user request generated by the user through the target application 3058 running overseas can also be protected by the DES 1040B after reaching the overseas application platform 3048 (which may include various services and storage) via the load balancer 3047 .
  • overseas operation and maintenance personnel can also perform operation and maintenance operations on the overseas application platform 3048 through the crystal gateway 3049 in the case of network isolation. This kind of operation and maintenance operation can also carry out data protection through DES 1040B.
  • in DES 1040 (for example, DES 1040A or 1040B), data can be preprocessed according to its type so that it is uniformly formatted, which simplifies the subsequent data sovereignty protection checks and speeds up the data exchange process.
  • DES 1040 can be divided into different processing parts according to data types.
  • DES 1040A may include a domestic user data channel for processing data related to domestic users in the specific country or region; an overseas user data channel for processing data related to overseas users; and an engineering technology data channel for processing engineering and operation and maintenance data, such as code, parameters and other research and development data, operation and maintenance data, and the like.
  • the data in each channel can be further divided.
  • data in different channels can be classified as one or more of Message Queue (MQ) data, Offline Aggregated Data, Target Object Store (TOS) data, Service Invocation data, or other types of data.
  • the data in the unified format can be converted back to the data in the original format and provided to the corresponding destination.
  • different types of data have different data formats and processing technologies.
  • by converting the data into a unified format, the complexity of the subsequent audit stage of data sovereignty protection can be reduced.
  • only the pre-processing and post-processing of data need to be changed, without complex changes to the processing of the data exchange constraint determination stage.
  • the data exchange architecture has great flexibility and scalability.
  • FIG. 3C shows a block diagram of an example architecture of DES 1040 according to some embodiments of the present disclosure.
  • the DES 1040 is shown as synchronizing data between the domestic application platform 3041 and the external application platform (collectively referred to as the overseas application platform 3048 ) for the target application, and performing the determination of data exchange constraints.
  • DES 1040 may include DES adapter 3061, DES centers and DES adapter 3070.
  • DES centers may include DES centers for different types of data channels, such as DES center 3065A for domestic user data, DES center 3065B for overseas user data, and DES center 3065C for engineering technical data, etc.
  • DES centers 3065A, 3065B and 3065C have different synchronization capabilities.
  • the DES centers 3065A, 3065B, and 3065C may be collectively referred to as the DES centers 3065 .
  • the DES adapter 3061 is connected to the domestic application platform 3041, and is used to receive data to be synchronized from the domestic application platform 3041 and to be detected by the DES 1040, and to send data received from the overseas application platform 3048 and detected by the DES 1040 to the domestic application platform 3041.
  • the DES adapter 3070 is connected with the overseas application platform 3048, and is used for sending the data received by the domestic application platform 3041 and detected by the DES 1040 to the overseas application platform 3048, and receiving data to be synchronized and detected by the DES 1040 from the overseas application platform 3048 .
  • Both DES adapter 3061 and DES adapter 3070 are interconnected with DES center 3065 to transmit data to DES center 3065.
  • Each DES center 3065 is configured to inspect data with data exchange constraints to ensure security and compliance of data exchanged between two application platforms. Generally, data that meets the data exchange constraints will be delivered to the corresponding destination through DES 1040, while data that does not meet the data exchange constraints may be rejected by DES 1040.
  • the DES adapters 3061 and 3070 can be configured to perform pre-processing and post-processing on the data to be transmitted to the DES center 3065, so that the DES center 3065 can perform the determination of data exchange constraints based on the uniformly formatted data corresponding to each data type.
  • the DES adapter 3061 and the DES center 3065 in the DES 1040 can be implemented in the TTP IDC 3056 together with the domestic application platform 3041, and the DES adapter 3070 can be implemented in the overseas IDC 3059 together with the overseas application platform 3048.
  • different components within DES 1040 may be isolated to further ensure more effective data isolation.
  • data isolation can be achieved by deploying different components in different data centers.
  • data isolation can be achieved by applying virtual private cloud (VPC) technology.
  • DES adapter 3061 may be implemented in VPC1
  • each DES center may be implemented in VPC2
  • DES adapter 3070 may be implemented in VPC3.
  • Determination of data security and compliance in DES center 3065 may be performed by TTP.
  • VPC1 and VPC3 do not have a direct communication connection, but VPC1 and VPC3 have direct communication connections with VPC2 respectively, and can communicate data/information with each other.
  • the DES center 3065 deployed in VPC2 can be regarded as an area trusted by the TTP (referred to as the TTP trusted area).
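  • The connectivity among VPC1, VPC2 and VPC3 described above can be illustrated with a minimal sketch. The `LINKS` table and the helper names `has_direct_link` and `route` are hypothetical and are not part of the disclosure; the sketch only shows that traffic between the two adapters necessarily passes through the VPC hosting the DES centers.

```python
from collections import deque

# Hypothetical table of direct links, following the description:
# VPC1 <-> VPC2 and VPC3 <-> VPC2, with no direct VPC1 <-> VPC3 link.
LINKS = {
    "VPC1": {"VPC2"},          # DES adapter 3061
    "VPC2": {"VPC1", "VPC3"},  # DES centers (TTP trusted area)
    "VPC3": {"VPC2"},          # DES adapter 3070
}


def has_direct_link(a, b):
    """Return True if VPC a can talk to VPC b without an intermediary."""
    return b in LINKS.get(a, set())


def route(src, dst):
    """Return the shortest list of hops from src to dst (BFS), or None."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(LINKS.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(has_direct_link("VPC1", "VPC3"))
    print(route("VPC1", "VPC3"))
```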
  • the DES adapter 3061 can include a DES entry 3062, which can realize the processing of the control plane, such as application for establishment and management of data channels by operation and maintenance personnel, registration rules, etc., and the data in the channel can be viewed by TTP.
  • the DES adapter 3061 may also include a DES proxy 3063, which may implement data plane processing, such as data verification, data filtering, data conversion, data sampling, log detection, and so on.
  • the DES adapter 3070 may include a DES entry 3072 for the control plane and a DES proxy 3073 for the data plane.
  • the DES center 3065A may include a DES registry for registering data exchange constraints, configuration data, and the like.
  • the DES center 3065A can also include further subdivided channels, including a service call channel for service call data, an MQ channel for MQ data, an HDFS channel for offline aggregated data (where HDFS stands for Hadoop Distributed File System), and a TOS channel for TOS data.
  • Offline aggregated data includes, for example, Hive-type data.
  • the service invocation data may include, for example, data for remote service invocation using various network protocols or invocation protocols, such as the HTTP protocol or the RPC protocol.
  • the MQ data may include data supporting the MQ protocol and similar protocols, for example, data stored in various databases (e.g., MySQL or Redis databases).
  • Offline aggregated data may include data in file systems based on HDFS technology and data in file systems based on other technologies.
  • TOS data includes object files such as video, audio, images, documents, and other media files.
  • the DES center 3065B for the overseas user data channel and the DES center 3065C for the engineering data channel may also include components similar to those of the DES center 3065A.
  • FIG. 3D shows a flowchart of a data exchange process 3004 according to some embodiments of the present disclosure.
  • Process 3004 may be implemented in DES 1040.
  • the DES 1040 obtains raw data to be exchanged by the target application between the first platform (e.g., domestic application platform 3041) and the second platform (e.g., overseas application platform 3048).
  • raw data may come from the first platform and may be received by DES adapter 3061 in DES 1040.
  • raw data can come from a second platform and can be received by DES adapter 3070 in DES 1040.
  • the DES 1040 processes the raw data based on its type to obtain uniformly formatted data corresponding to that type.
  • the processing of the original data (the processing here may also be referred to as pre-processing) can be determined according to the type of the original data.
  • the type of raw data may include, for example, MQ data, offline aggregation data, TOS data, or service call data.
  • the processing of raw data can also be determined according to different data sources. For example, according to the data source, raw data can be classified as domestic user data, overseas user data or engineering technical data. Different types of data correspond to different formats, and different methods can be applied to generate corresponding unified formatted data.
  • the same type of data may be provided in different formats, which increases the requirements for technical processing. Therefore, a uniform format can be specified.
  • the format of the original data can be converted to the specified format under this type through format conversion to obtain uniformly formatted data.
  • MQ data in different formats may be parsed so as to analyze content in messages encapsulated in different formats.
  • for offline aggregation data and TOS data, the different requests for invoking such data from file systems or data systems in different formats can be converted into file invocation requests implemented through a unified API.
  • service invocation requests generated under different protocols can be converted into service invocation requests in a unified protocol.
  • the DES 1040 determines, from the uniformly formatted data, whether the data exchange constraints are satisfied.
  • the DES center 3065 in the DES 1040 especially the DES center 3065 corresponding to the data type, can perform a check on whether the data exchange constraints are satisfied or not.
  • DES Center 3065 does not need to apply various technologies to parse the original data, so it is more convenient to use rules to perform data security and compliance checks.
  • the DES 1040 converts the uniformly formatted data into raw data. Data is allowed to be synchronized between platforms subject to the data exchange constraints being met. In order to ensure that the data is correctly synchronized, the DES 1040 will further process the uniformly formatted data generated in the middle (i.e., perform a post-processing stage) to convert the uniformly formatted data into raw data, which has the original format.
  • the DES 1040 performs an exchange of raw data between the first platform and the second platform. As a result, data exchange can be achieved while satisfying security and compliance.
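  • The steps of the process above (obtain raw data, convert it to a unified format, check the data exchange constraints, convert it back, and exchange it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in the disclosure: the raw data is assumed to be JSON, the constraint is assumed to be a set of forbidden field names, and `Unified`, `preprocess`, `satisfies_constraints`, `postprocess` and `exchange` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Unified:
    """Hypothetical unified format shared by the constraint check."""
    data_type: str   # e.g. "mq" or "service_call"
    source: str      # e.g. "domestic_user"
    payload: dict


def preprocess(raw: bytes, data_type: str, source: str) -> Unified:
    # Assumption: the raw data is JSON; real channels would parse per type.
    return Unified(data_type, source, json.loads(raw))


def satisfies_constraints(item: Unified, forbidden_fields: Set[str]) -> bool:
    # Assumption: a constraint is a set of field names that must not leave.
    return forbidden_fields.isdisjoint(item.payload)


def postprocess(item: Unified) -> bytes:
    # Convert back to the original (here JSON) representation.
    return json.dumps(item.payload, sort_keys=True).encode()


def exchange(raw: bytes, data_type: str, source: str,
             forbidden_fields: Set[str],
             deliver: Callable[[bytes], None]) -> bool:
    """Run preprocess -> check -> postprocess -> deliver; False if rejected."""
    item = preprocess(raw, data_type, source)
    if not satisfies_constraints(item, forbidden_fields):
        return False
    deliver(postprocess(item))
    return True


if __name__ == "__main__":
    outbox = []
    print(exchange(b'{"video_id": 7}', "mq", "domestic_user",
                   {"phone"}, outbox.append))
    print(exchange(b'{"phone": "n/a"}', "mq", "domestic_user",
                   {"phone"}, outbox.append))
```

  • Because the check operates only on `Unified`, a new raw format would require changes to `preprocess` and `postprocess` only, which mirrors the separation described above.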
  • each data channel may comprise pre-processing components suitable for processing that type of raw data, post-processing components and validation components on data exchange constraints. Additionally or alternatively, each data channel may be registered with data exchange constraints to be applied to that particular type of raw data. In this way, preprocessing of different types of data, validation of data exchange constraints, and separation of postprocessing can be achieved.
  • Data channels corresponding to different types of data can be flexibly created, updated and deleted. In this way, if the pre-processing and post-processing methods of the data change, or the data exchange constraints for a specific type of data need to be updated, they can be executed in the corresponding data channel without affecting other data channels.
  • when a new type of raw data is to be exchanged between the first platform and the second platform, a new data channel can be flexibly created between the two platforms for processing the new type of raw data.
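  • The idea that each data type has its own channel, which can be created, updated and deleted independently, can be sketched as a small registry. The names `DataChannel` and `ChannelRegistry` and the callable-based design are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class DataChannel:
    """Hypothetical channel: pre-processing, constraint check, post-processing."""
    preprocess: Callable[[Any], dict]
    check: Callable[[dict], bool]
    postprocess: Callable[[dict], Any]


class ChannelRegistry:
    """Keeps one channel per data type so channels evolve independently."""

    def __init__(self) -> None:
        self._channels: Dict[str, DataChannel] = {}

    def create(self, data_type: str, channel: DataChannel) -> None:
        if data_type in self._channels:
            raise ValueError(f"channel already exists: {data_type}")
        self._channels[data_type] = channel

    def update(self, data_type: str, channel: DataChannel) -> None:
        if data_type not in self._channels:
            raise KeyError(data_type)
        self._channels[data_type] = channel

    def delete(self, data_type: str) -> None:
        del self._channels[data_type]

    def process(self, data_type: str, raw: Any) -> Optional[Any]:
        """Return the post-processed data, or None if the check rejects it."""
        channel = self._channels[data_type]
        unified = channel.preprocess(raw)
        if not channel.check(unified):
            return None
        return channel.postprocess(unified)


if __name__ == "__main__":
    registry = ChannelRegistry()
    registry.create("mq", DataChannel(
        preprocess=lambda raw: {"body": raw},
        check=lambda msg: "secret" not in msg["body"],
        postprocess=lambda msg: msg["body"],
    ))
    print(registry.process("mq", "hello"))
    print(registry.process("mq", "secret token"))
```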
  • Figure 3E shows a flow diagram of an example data flow 3005 for various types of data processing implemented at the DES 1040, according to some embodiments of the present disclosure.
  • the data flow 3005 relates to the data flow of the control plane and the data flow of the data plane.
  • one or more types of data channels can be configured in the DES 1040 by the operation and maintenance personnel, and updates and maintenance of the channels can be realized.
  • domestic operation and maintenance personnel can request, via the DES entry 3062, configuration of a specific data type and of a channel for processing that data type, and can register a data directory 3081 indicating the specific data type and a data definition 3082 for the specific data.
  • the data definition 3082 may specify channel information for processing different types of data in the DES 1040, and may include preprocessing schemes, postprocessing schemes, etc. for corresponding types of data.
  • overseas operation and maintenance personnel can also request configuration of specific data types and channels for processing specific data types via the DES entry 3072 .
  • Overseas operation and maintenance personnel can also register, in the DES registry 3066, a data directory 3084 indicating a specific data type and a data definition 3085 for the specific data.
  • the data definition 3085 can specify channel information for processing different types of data in the DES 1040, and can include preprocessing schemes, postprocessing schemes, etc. for corresponding types of data.
  • service invocation requests are exchanged between the client or server 3086 on the TTP IDC side and the client or server 3090 on the overseas IDC side.
  • the service call request is processed in the service call channel in DES 1040.
  • the service invocation channel may at least include a preprocessing module 3087 in the DES proxy 3063 , an HTTP proxy 3088 in the DES center 3065 , and a routing module 3089 in the DES proxy 3073 .
  • the service call request from the client or server 3086 on the TTP IDC side is sent to the preprocessing module 3087.
  • the preprocessing module 3087 uses the data preprocessing scheme specified in the data definition 3082 to process the service invocation request, and sends the uniformly formatted service invocation request to the HTTP proxy 3088 .
  • the HTTP proxy 3088 may provide the uniformly formatted service call request to the other client or server 3090 through the routing module 3089 after determining that the uniformly formatted service call request satisfies the data exchange constraints. Before being provided to the client or server 3090, the uniformly formatted service invocation request is converted back to the service invocation request conforming to the original protocol.
  • the MQ channel may at least include a preprocessing module 3092 in the DES proxy 3063 , an MQ transmitter 3094 in the DES center 3065 , and a routing module 3097 in the DES proxy 3073 .
  • the raw data 3091 for the MQ type is passed to the preprocessing module 3092 .
  • the preprocessing module 3092 processes the original data 3091 using the data preprocessing scheme specified in the data definition 3082 to obtain unified formatted data 3093 .
  • Uniformly formatted data 3093 is extracted by the MQ transmitter 3094, for example via a third-party software development kit (SDK). After the data exchange constraints are checked, the SDK pushes the uniformly formatted data 3096 that meets the rules to the overseas IDC. Uniformly formatted data 3095 that does not satisfy the data exchange constraints is rejected.
  • the routing module 3097 routes the uniformly formatted data 3096 satisfying the rules to the corresponding destination, and the uniformly formatted data 3096 is converted back to the corresponding original data 3098 before being transmitted to the destination.
  • HDFS channel and TOS channel may include the illustrated components.
  • HDFS channel or TOS channel may include at least preprocessing module 3100 in DES proxy 3063 , file transmitter 3103 in DES center 3065 , and routing module 3105 in DES proxy 3073 .
  • the preprocessing module 3100 may initiate a request to the file transfer manager 3102 to call the file transfer API to obtain raw data 3099 of the offline aggregated data type or TOS type.
  • the raw data 3099 is sent to the preprocessing module 3100.
  • the preprocessing module 3100 can process the raw data 3099 using the data preprocessing scheme specified in the data definition 3082 to obtain the unified formatted data 3101 .
  • the uniformly formatted data 3101 is extracted by the file transmitter 3103, for example via the SDK.
  • the SDK will push the uniformly formatted data 3104 that meets the rules to the overseas IDC.
  • Uniformly formatted data that does not meet the data exchange constraints will be rejected and cannot be sent to overseas IDCs.
  • the routing module 3105 routes the uniformly formatted data 3104 satisfying the rules to the corresponding destination, and the uniformly formatted data 3104 is converted back to the corresponding original data 3106 before being transmitted to the destination.
  • FIG. 3E only shows the processing of the outgoing data from the TTP IDC to the overseas IDC in the DES 1040.
  • incoming data from the overseas IDC to the TTP IDC can also be processed in DES 1040 through a similar process, and DES 1040 can also reserve corresponding components to support such processing, especially components in the DES adapters.
  • FIG. 3F shows a schematic block diagram of a data exchange architecture 3006 involving MQ channels according to some embodiments of the present disclosure.
  • Data exchange architecture 3006 may be implemented in DES 1040 for performing data security protection for MQ type data.
  • in FIG. 3F, data exchange in the direction from the TTP IDC to the overseas IDC is shown.
  • the source database 3110 in the TTP IDC generates the entities of the MQ data to be transmitted.
  • MQ data may include messages such as change data or business custom events, and different messages may have different formats.
  • the MQ data generated by the source database 3110 is put into the source message queue 3112 .
  • the DES adapter 3061, in addition to the DES entry 3062, also includes a DES pre-adapter 3120.
  • the DES pre-adapter 3120 can be implemented as a part of the DES proxy 3063, so as to preprocess the MQ data in the direction from the TTP IDC to the overseas IDC.
  • DES pre-adapter 3120 may be configured to process MQ data in different formats into uniformly formatted MQ data, and provide the uniformly formatted MQ data to the MQ transmitter 3094 to determine whether the data exchange constraints are satisfied.
  • MQ data may also include data generated by different protocols, and the data under each protocol has a custom format and therefore requires different preprocessing.
  • the DES pre-adapter 3120 may include a parser 3122 configured to parse different types of raw MQ data, so as to convert them into uniformly formatted MQ data.
  • the DES pre-adapter 3120 may include a MySQL parser for parsing data generated by the MySQL protocol, such as change data capture (CDC) data; a Redis parser for parsing data generated by the Redis protocol, such as CDC data; a document parser for parsing data in a document database, especially CDC data; a graph parser for parsing data of a graph database, especially CDC data; an MQ parser for parsing different types of business event data sent through the message queue; and so on. It can be understood that the parser 3122 can be flexibly scaled, and more, fewer or other parsers can be set to parse corresponding types of MQ data.
  • the unified formatted MQ data obtained after parsing may also be in the form of a message queue, and may be put into the unified formatted message queue 3124 .
  • the MQ transmitter 3094 in charge of MQ data can extract the parsed uniformly formatted MQ data from the unified formatted message queue 3124 through the SDK for data security and compliance checks.
  • the uniformly formatted MQ data that does not satisfy the data exchange constraints is rejected by the MQ transmitter 3094 and recorded in the rejection log 3126.
  • the uniformly formatted MQ data meeting the data exchange constraints is pushed to the DES post-adapter 3130 in the DES adapter 3070 via the SDK.
  • the DES post-adapter 3130 may be implemented as part of the DES proxy 3073 for post-processing the uniformly formatted MQ data in the direction from the TTP IDC to the overseas IDC to deliver the data to the destination.
  • DES post-adapter 3130 may include data replayer 3132 for performing post-processing on uniformly formatted MQ data.
  • the DES post-adapter 3130 may be configured to convert the uniformly formatted MQ data into raw MQ data.
  • the DES post-adapter 3130 may include replayers corresponding to different types of MQ data for performing conversion from the unified format to respective customized formats.
  • the DES post-adapter 3130 can include a MySQL replayer for converting the uniformly formatted MQ data into MQ data conforming to the MySQL protocol; a Redis replayer for converting the uniformly formatted MQ data into MQ data conforming to the Redis protocol; a document replayer for converting the uniformly formatted MQ data into raw data of a document database; a graph replayer for converting the uniformly formatted MQ data into raw data of a graph database; an MQ replayer for converting the uniformly formatted data into raw data conforming to the MQ protocol; and so on.
  • the converted original MQ data is put into the uniformly formatted message queue 3134 in the DES post-adapter 3130, and can be synchronized from there to the target message queue 3135.
  • the target message queue 3135 is used to store the MQ data that is indirectly synchronized via the DES 1040 from the source message queue 3112.
  • the target database 3136 may obtain the desired MQ data from the target message queue 3135 .
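  • The parser and replayer pairing described above can be sketched as a round trip through a unified message format. The event shapes used here (a dictionary for a MySQL change event, a command list for Redis), the unified fields and all function names are hypothetical simplifications, not the formats used in the disclosure.

```python
from typing import Any, Callable, Dict, List, Optional


def parse_mysql(event: dict) -> dict:
    # Assumed raw shape: {"table": ..., "type": "INSERT", "row": {...}}
    return {"kind": "mysql", "entity": event["table"],
            "op": event["type"].lower(), "data": dict(event["row"])}


def replay_mysql(msg: dict) -> dict:
    return {"table": msg["entity"], "type": msg["op"].upper(),
            "row": dict(msg["data"])}


def parse_redis(command: List[str]) -> dict:
    # Assumed raw shape: ["SET", key, *args]
    name, key, *args = command
    return {"kind": "redis", "entity": key,
            "op": name.lower(), "data": {"args": list(args)}}


def replay_redis(msg: dict) -> List[str]:
    return [msg["op"].upper(), msg["entity"], *msg["data"]["args"]]


PARSERS: Dict[str, Callable[[Any], dict]] = {
    "mysql": parse_mysql, "redis": parse_redis}
REPLAYERS: Dict[str, Callable[[dict], Any]] = {
    "mysql": replay_mysql, "redis": replay_redis}


def transfer(kind: str, raw: Any, is_allowed: Callable[[dict], bool],
             rejection_log: List[dict]) -> Optional[Any]:
    """Parse, check, then replay; rejected messages go to the rejection log."""
    unified = PARSERS[kind](raw)
    if not is_allowed(unified):
        rejection_log.append(unified)
        return None
    return REPLAYERS[unified["kind"]](unified)


if __name__ == "__main__":
    log: List[dict] = []
    allow = lambda msg: msg["entity"] != "secrets"
    print(transfer("redis", ["SET", "user:1", "alice"], allow, log))
    print(transfer("mysql", {"table": "secrets", "type": "INSERT",
                             "row": {"id": 1}}, allow, log))
    print(len(log))
```

  • Adding support for another source would mean registering one more parser and replayer pair, without touching the check inside `transfer`.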
  • Figure 3F only shows the components involved in the data exchange from the TTP IDC to the overseas IDC.
  • for data exchange in the direction from the overseas IDC to the TTP IDC, DES 1040 can include similar components. For example, the DES adapter 3070 can include a DES pre-adapter with functions similar to those of the DES pre-adapter 3120, and the DES adapter 3061 may include a DES post-adapter with functions similar to those of the DES post-adapter 3130.
  • the processing in this direction is not described in detail here.
  • FIG. 3G shows a schematic block diagram of a data exchange architecture 3500 involving HDFS channels according to some embodiments of the present disclosure.
  • Data exchange architecture 3500 may be implemented in DES 1040 for performing data security protection for offline aggregated data.
  • the offline aggregation data exchange between HDFS 3502 on the TTP IDC side and HDFS 3504 on the overseas IDC side is shown.
  • Some offline aggregated data in HDFS 3502 and HDFS 3504 may need to be synchronized with each other.
  • the data transmission detector 3510 on the TTP IDC side is responsible for detecting whether offline aggregation data that needs to be transmitted to the HDFS 3504 on the other side is stored in the HDFS 3502.
  • the data transfer submitter 3520 may submit a request for data transfer to the file transmitter 3550.
  • the data preprocessing module 3530 is configured to perform preprocessing on the data, so as to process the offline aggregated data into uniformly formatted data.
  • data transfer server 3556 is configured to control data transfer services based on the data exchange constraints. If the data transfer server 3556 determines that the preprocessed uniformly formatted data from HDFS 3502 conforms to the data exchange constraints, the transfer job 3558 can be invoked to transmit the uniformly formatted data to the overseas IDC through the transfer task 3562 under the transfer job 3558.
  • the transfer job 3558 can also optionally include a data validation task 3560, which can be configured to perform data validation as needed.
  • the uniformly formatted data passes through the HDFS gateway 3564 and can be post-processed to obtain the original offline aggregated data, which is then stored in the HDFS 3504.
  • the data transmission detector 3570 on the overseas IDC side is responsible for detecting whether offline aggregation data that needs to be transmitted to the HDFS 3502 on the TTP IDC side is stored in the HDFS 3504.
  • the data transfer submitter 3572 may submit a request for data transfer to the file transmitter 3550.
  • the data preprocessing module 3570 is configured to perform preprocessing on the data, so as to process the offline aggregated data into uniformly formatted data.
  • if the data exchange constraints are satisfied, transfer job 3554 may be invoked to send the uniformly formatted data to the TTP IDC via the transfer task 3552 under the transfer job 3554. After the uniformly formatted data is post-processed, the original offline aggregated data is obtained and stored in HDFS 3502.
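  • The transfer job described above, with a transfer task and an optional data validation task, can be sketched as follows. Plain dictionaries stand in for the two HDFS instances, the constraint is a hypothetical predicate, and the use of a SHA-256 checksum for validation is an assumption; the disclosure does not specify how validation is performed.

```python
import hashlib
from typing import Callable, Dict


def checksum(data: bytes) -> str:
    """SHA-256 digest used by the (assumed) validation task."""
    return hashlib.sha256(data).hexdigest()


def run_transfer_job(source_fs: Dict[str, bytes], dest_fs: Dict[str, bytes],
                     path: str, is_allowed: Callable[[str, bytes], bool],
                     validate: bool = True) -> str:
    """Check the constraint, copy the file, then optionally validate it."""
    data = source_fs[path]
    if not is_allowed(path, data):
        return "rejected"
    dest_fs[path] = bytes(data)  # transfer task
    if validate and checksum(dest_fs[path]) != checksum(data):
        del dest_fs[path]        # validation task failed; undo the copy
        return "validation_failed"
    return "transferred"


if __name__ == "__main__":
    ttp_hdfs = {"/warehouse/agg/part-0": b"a,1\n",
                "/warehouse/raw/users": b"x\n"}
    overseas_hdfs: Dict[str, bytes] = {}
    only_agg = lambda path, data: path.startswith("/warehouse/agg/")
    for p in sorted(ttp_hdfs):
        print(p, run_transfer_job(ttp_hdfs, overseas_hdfs, p, only_agg))
```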
  • a TOS channel can determine whether an object file satisfies data exchange constraints, and if the constraints are met, copy the object file from a source IDC (e.g., TTP IDC or overseas IDC) to a destination IDC (e.g., overseas IDC or TTP IDC).
  • Object files are, for example, video, audio, image, document, or other media files.
  • an object file may be copied from the object store through an API, determination of data exchange constraints may be performed, and the object file may be pushed to the object store at the destination using the API.
  • the satisfaction of the data exchange constraint is determined by the copy request corresponding to the object file. Details of the TOS channel will be described below with reference to FIGS. 3H to 3J .
  • FIG. 3H shows a schematic diagram of a target object store (TOS) channel 3600 for data replication from a TTP IDC to an overseas IDC according to some embodiments of the present disclosure.
  • The data to be exchanged is an object file, which is stored in the object store 3606 in the TTP IDC and is expected to be exchanged to the object store 3607 of the overseas IDC.
  • The API 3602 in the TTP IDC is configured to push the replication request to the working node 3605 and to receive, from the working node 3605, the replication result exchanged from the overseas IDC on the other side.
  • a copy request for an object file to be exchanged is transmitted by an API (also referred to as a DES-TOS API) 3602 to a worker node 3605.
  • the copy request may indicate information related to the object file to be exchanged, such as the format of the object file (video, audio, text, etc.), the identifier of the object file, and other file metadata.
  • the copy request has a uniform format.
  • Worker nodes 3605 within the trusted zone VPC2 are configured to perform determinations regarding data exchange constraints in response to copy requests for object files. Specifically, the working node 3605 may determine from the uniformly formatted copy request whether the object file to be exchanged satisfies the data exchange constraint.
  • registration of data exchange constraints can be initiated at the initial stage or when needed later.
  • the data exchange constraints to be used may be registered with the DES registry 3624 in the TTP trusted zone through the DES entry 3620 in the TTP IDC.
  • Registration of data exchange constraints can be achieved by calling API 3602.
  • the working node 3605 can access the data exchange constraints currently to be used through the DES registry 3624.
  • The data exchange constraints may indicate a whitelist of object files that are allowed to be exchanged or a blacklist of object files that are not allowed to be exchanged. In each list, object files can be identified by object file format, identifier, and the like.
  • The working node 3605 allows execution of replication requests that satisfy the data exchange constraints. If the copy request is allowed to be executed, the working node 3605 accesses the object store 3606 in the TTP IDC to copy the object file to the object store 3607 in the overseas IDC. Illegal requests (that is, replication requests that do not satisfy the data exchange constraints) are rejected and therefore not executed.
  • the working node 3605 can write the copied object file into the object storage 3607 via the API 3610 in the overseas IDC. In this way, the data stream ends 3611.
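  • As an illustration only, the whitelist/blacklist determination that the working node 3605 performs on a uniformly formatted copy request could be sketched as follows. The names `CopyRequest`, `ExchangeConstraints`, and `is_allowed`, the field layout, and the rule that a blacklist match overrides a whitelist match are assumptions made for this sketch and are not specified in the disclosure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CopyRequest:
    """Uniformly formatted copy request (hypothetical field names)."""
    object_id: str
    object_format: str  # e.g. "video", "audio", "text"


@dataclass
class ExchangeConstraints:
    """Whitelist/blacklist of object formats and identifiers (assumed layout)."""
    allowed_formats: set = field(default_factory=set)
    allowed_ids: set = field(default_factory=set)
    blocked_formats: set = field(default_factory=set)
    blocked_ids: set = field(default_factory=set)

    def is_allowed(self, request: CopyRequest) -> bool:
        # Assumption: a blacklist match always rejects the request.
        if request.object_id in self.blocked_ids:
            return False
        if request.object_format in self.blocked_formats:
            return False
        # Otherwise the request must match the whitelist by format or identifier.
        return (request.object_format in self.allowed_formats
                or request.object_id in self.allowed_ids)


constraints = ExchangeConstraints(
    allowed_formats={"video", "image"},
    blocked_ids={"obj-42"},
)
```

  • In this sketch, a request for a whitelisted format passes unless its identifier is blacklisted, and a request matching neither list is rejected by default.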
  • FIG. 3I shows a schematic diagram of a TOS channel 3650 for data replication from an overseas IDC to a TTP IDC according to some embodiments of the present disclosure.
  • The object files to be exchanged are stored in the object store 3607 in the overseas IDC and are expected to be exchanged to the object store 3606 of the TTP IDC.
  • the API 3610 in the overseas IDC is configured to push the replication request to the working node 3605, and the working node 3605 receives the replication result exchanged from the TTP IDC on the other side.
  • a copy request for an object file to be exchanged is transmitted by API 3610 to a worker node 3605.
  • the copy request may indicate information related to the object file to be exchanged, such as the format of the object file (video, audio, text, etc.), the identifier of the object file, and other file metadata.
  • the copy request has a uniform format.
  • the working node 3605 in the trusted area VPC2 can determine whether the object file to be exchanged meets the data exchange constraint from the uniformly formatted copy request.
  • registration of data exchange constraints may be initiated at the initial stage or when needed later.
  • After constraint registration starts 3632, the data exchange constraints to be used can be registered with the DES registry 3624 in the TTP trusted zone through the DES entry 3630 in the overseas IDC.
  • the registration of data exchange constraints can be realized by calling API 3610.
  • the working node 3605 can access the currently used data exchange constraints through the DES registry 3624 .
  • The working node 3605 allows execution of replication requests that satisfy the data exchange constraints. If the copy request is allowed to be executed, the working node 3605 accesses the object store 3607 in the overseas IDC to copy the object file to the object store 3606 in the TTP IDC. Illegal requests (that is, replication requests that do not satisfy the data exchange constraints) are rejected and therefore not executed.
  • the worker nodes 3605 can write the copied object files to the object store 3606 via the API 3602 in the TTP IDC. Thus, the data stream ends 3652.
  • Figure 3J illustrates a message sequence 3012 in a TOS channel according to some embodiments of the present disclosure.
  • Message sequence 3012 in Figure 3J involves TTP 3701, operation and maintenance personnel 3702, platform staff 3703, DES entry 3704, API 3705, worker nodes 3605, and object storage 3708.
  • DES entry 3704, API 3705, and object store 3708 in Figure 3J may be corresponding components in either of Figures 3H and 3I.
  • the DES entry 3704 includes the DES entry 3620 shown in Figure 3H
  • the API 3705 includes the API 3602 in Figure 3H
  • the object storage 3708 includes the object store 3606 shown in Figure 3H.
  • the DES entry 3704 includes the DES entry 3630 shown in Figure 3I
  • the API 3705 includes the API 3610 in Figure 3I
  • the object storage 3708 includes the object storage 3607 in Figure 3I.
  • operation and maintenance personnel 3702 register 3711 data exchange constraints with DES entry 3704 , which can restrict the replication of object files between object stores 3606 and 3607 of different IDCs.
  • the DES entry 3704 may send 3714 a response to the operation and maintenance personnel.
  • the DES entry 3704 registers 3712 the container information about the data exchange constraints with the API 3705, and the API 3705 can send 3713 a response to the DES entry 3704 after the registration is completed.
  • Rules registered via the DES entry 3704 may be cached 3715 to the API 3705, and may also be cached 3716 to the worker nodes 3605.
  • a platform worker 3703 may initiate 3717 a copy request to the API 3705 for the object file.
  • API 3705 may perform authentication 3718.
  • The working node 3605 may pull 3719 copy requests from the API 3705 and perform 3720 the determination of data exchange constraints on the object files to be copied. If copying of the object file is allowed, the working node 3605 performs 3721 the file copy to copy the corresponding object file from the object store 3708. Regardless of the result of the data exchange constraint determination, the working node 3605 returns 3722 feedback to the API 3705. Where copying of the object file is allowed, the feedback includes the copied object file. Where copying of the object file is not allowed, the feedback indicates that the copy request was denied.
  • the platform worker 3703 can call back 3723 the API 3705, from which the API 3705 can return 3724 the copy request ID to the platform worker 3703.
  • the TTP 3701 can view 3725 historical object file replication through the DES entry 3704, to confirm whether the object file exchange in the past period of time meets the requirements of the data exchange constraints.
  • the DES entry 3704 may return 3726 the results to be viewed.
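  • The pull, determine, and feedback steps (3719 to 3722) of the message sequence, together with a record that can later be reviewed, could be sketched roughly as below. The function `process_pending`, the `Feedback` record, and the in-memory `history` list are hypothetical; the disclosure does not state how or where the replication history is stored, and the file copy itself is left out.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class Feedback:
    request_id: str
    allowed: bool


def process_pending(
    pending: Deque[str],
    is_allowed: Callable[[str], bool],
    history: List[Feedback],
) -> List[Feedback]:
    """Pull each pending copy request, decide, and record the outcome."""
    results = []
    while pending:
        request_id = pending.popleft()      # step 3719: pull from the API
        allowed = is_allowed(request_id)    # step 3720: constraint determination
        # Step 3721 (the actual file copy) is omitted from this sketch.
        feedback = Feedback(request_id, allowed)  # step 3722: always give feedback
        history.append(feedback)            # assumed audit record for later review
        results.append(feedback)
    return results


history: List[Feedback] = []
queue = deque(["req-1", "req-2"])
results = process_pending(queue, lambda rid: rid != "req-2", history)
```

  • Feedback is produced for both allowed and denied requests, which mirrors the statement that the working node reports back regardless of the determination result.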
  • FIG. 3K shows a schematic block diagram of a data exchange architecture 3800 involving a service invocation channel according to some embodiments of the present disclosure.
  • Data exchange framework 3800 may be implemented in DES 1040 for enforcing data security protection for service call type data.
  • In FIG. 3K, a service invocation data exchange between a target platform service 3802 on the TTP IDC side and an overseas (non-TTP) platform service 3804 on the overseas IDC side is shown.
  • services on the target platform service 3802 may need to call services on the overseas platform service 3804
  • services on the overseas platform service 3804 may also need to call services on the target platform service 3802 .
  • Different service platforms may apply a variety of different service call protocols, such as HTTP protocol or Thrift RPC protocol.
  • the non-TTP control plane is used for channel registration, channel architecture update, and detection; the TTP/TTP control plane is used for channel request approval, channel prohibition, and channel detection.
  • the HTTP load balancer 3810 is an L7 balancing product from TTP Cloud, which is a key component to ensure that all DES-RPC channel traffic passes through the VPC trusted area.
  • the HTTP channel is a channel that supports the HTTP protocol in the DES-RPC channel.
  • the Thrift RPC channel is a channel that supports the Thrift RPC protocol in the DES-RPC channel. Thrift RPC channels will be wrapped in HTTP channels before being sent to the HTTP load balancer for TTP.
  • Channel information can include the type of channel, such as Thrift RPC or HTTP.
  • Channel information may also include RPC call tuples.
  • the call tuple can include src dc, src service, dst dc, dst service, rpc method/http path.
  • Data definition can depend on the direction in which the data is flowing. For data flow from non-TTP to TTP, the response will be declared using Thrift IDL with compliance annotations. For data flow from TTP to non-TTP, requests will be declared using Thrift IDL with compliance annotations.
  • a DES-RPC channel is only available when the DES-RPC channel passes compliance registration.
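  • A minimal sketch of the channel information and of the rule that a DES-RPC channel is available only after compliance registration is given below. The class names `CallTuple`, `Channel`, and `ChannelRegistry`, and the in-memory set used as the registry, are assumptions for illustration; the tuple fields follow the call tuple listed above with names adapted for Python.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class CallTuple:
    """RPC call tuple: src dc, src service, dst dc, dst service, method/path."""
    src_dc: str
    src_service: str
    dst_dc: str
    dst_service: str
    method_or_path: str  # Thrift RPC method or HTTP path


@dataclass(frozen=True)
class Channel:
    channel_type: str  # "thrift" or "http"
    call: CallTuple


class ChannelRegistry:
    """Hypothetical record of channels that passed compliance registration."""

    def __init__(self) -> None:
        self._approved: Set[Channel] = set()

    def approve(self, channel: Channel) -> None:
        self._approved.add(channel)

    def prohibit(self, channel: Channel) -> None:
        self._approved.discard(channel)

    def is_available(self, channel: Channel) -> bool:
        # A channel is usable only if it has passed compliance registration.
        return channel in self._approved


registry = ChannelRegistry()
feed = Channel("http", CallTuple("non-ttp", "svc_a", "ttp", "svc_b", "/v1/feed"))
other = Channel("thrift", CallTuple("ttp", "svc_a", "non-ttp", "svc_c", "GetItem"))
registry.approve(feed)
```

  • Because the dataclasses are frozen, a channel is matched by the full call tuple, so a call that differs in any field (for example, a different HTTP path) is treated as an unregistered channel.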
  • FIG. 3L shows an example of data exchange from non-TTP to TTP in the service invocation channel shown in FIG. 3K according to some embodiments of the present disclosure.
  • the call initiated by service A 3901 in the overseas region will be forwarded by HTTP proxy 3902 or Thrift proxy 3903 to HTTP load balancer 3905 of TTP.
  • Service A 3901 may be an example of the overseas platform service shown in FIG. 3K.
  • For HTTP requests, the call will be forwarded by the HTTP proxy 3902 to the HTTP load balancer 3905.
  • For Thrift requests, the call will be forwarded by the Thrift proxy 3903 to the HTTP load balancer 3905.
  • Service discovery of the HTTP load balancer 3905 in the VPC trusted zone can be realized through DNS, and service discovery from the HTTP load balancer 3905 to the corresponding IDC traffic agent can be achieved by other custom or generic service discovery mechanisms.
  • The HTTP load balancer 3905 of the TTP forwards the request to the HTTP proxy 3907 and the Thrift proxy 3908 of the TTP, respectively, and the HTTP proxy 3907 and the Thrift proxy 3908 then forward the request to service B 3909 and service C 3910, respectively, as the target services.
  • the Thrift proxy 3908 will restore the original Thrift request from the generated new HTTP request before sending the request.
  • TTP's HTTP Proxy 3907 and Thrift Proxy 3908 will check the response before sending it to TTP's HTTP Load Balancer 3905. An error is returned for responses that fail the compliance check. Additionally, for Thrift rpc calls, the Thrift response will be wrapped with HTTP to generate a new HTTP response. The body of the new HTTP response is the Thrift binary.
  • FIG. 3M shows an example of data exchange from TTP to non-TTP in the service invocation channel shown in FIG. 3K according to some embodiments of the present disclosure.
  • calls initiated by TTP's Service A 3951 will be forwarded by TTP's HTTP Proxy 3952 and Thrift Proxy 3953 to TTP's HTTP Load Balancer 3955.
  • For HTTP requests, the call will be forwarded by the HTTP proxy 3952 to the HTTP load balancer 3955.
  • For Thrift requests, the call will be forwarded by the Thrift proxy 3953 to the HTTP load balancer 3955.
  • For illegal requests, an error will be returned. An error is returned for responses that fail the compliance check.
  • For Thrift RPC calls, the request will be wrapped with HTTP to generate a new HTTP request. The body of the new HTTP request is the Thrift binary.
  • The HTTP load balancer 3955 of the TTP forwards the request to the HTTP proxy 3957 and the Thrift proxy 3958 on the non-TTP side (that is, in the overseas region). The HTTP proxy 3957 and the Thrift proxy 3958 then forward the request to service B 3959 and service C 3960 in the overseas region.
  • the Thrift proxy will restore the original Thrift request from the generated new HTTP request before sending it.
  • Non-TTP's HTTP Proxy 3957 and Thrift Proxy 3958 will send responses to TTP's HTTP Load Balancer 3955.
  • For Thrift RPC calls, the Thrift response will be wrapped with HTTP to generate a new HTTP response.
  • the body of the new HTTP response is the Thrift binary.
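  • The HTTP wrapping of Thrift calls described for FIGS. 3L and 3M can be illustrated with a small sketch in which the Thrift binary becomes the body of a raw HTTP message and is later restored unchanged. The function names, the use of a POST request, and the `application/x-thrift` media type are assumptions of this sketch; the disclosure only states that the body of the new HTTP message is the Thrift binary.

```python
def wrap_thrift_in_http(thrift_payload: bytes, host: str, path: str) -> bytes:
    """Build a raw HTTP/1.1 POST whose body is the Thrift binary payload."""
    headers = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-thrift",  # assumed media type
        f"Content-Length: {len(thrift_payload)}",
    ]
    head = "\r\n".join(headers).encode("ascii") + b"\r\n\r\n"
    return head + thrift_payload


def unwrap_thrift_from_http(http_message: bytes) -> bytes:
    """Recover the original Thrift binary from the wrapped HTTP message."""
    # The first blank line ends the headers; everything after it is the body.
    head, _, body = http_message.partition(b"\r\n\r\n")
    length = None
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return body[:length]


payload = b"\x80\x01\x00\x01\r\n\r\nexample-bytes"
wrapped = wrap_thrift_in_http(payload, "lb.example.internal", "/thrift/svc_b")
restored = unwrap_thrift_from_http(wrapped)
```

  • Using the Content-Length header to delimit the body lets the receiving proxy restore the payload byte for byte, even when the binary data itself contains line-break sequences.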
  • Client applications need to communicate with the server to transfer data.
  • Client app network traffic can transfer large amounts of user data. Therefore, there is a need for a method capable of managing the network traffic of the client application, so that user data will not be transmitted to an unauthorized server via the network traffic of the client application. For example, in the scenario of data sovereignty protection, this method can prevent user data from being transmitted to servers in non-data sovereign countries.
  • Client applications may include mobile applications and computer (PC) applications.
  • the network traffic of the client application may include native network traffic, web view network traffic, and the like.
  • not all network traffic of client applications is under the management and control of the application's owner.
  • web traffic for client applications may include web traffic from third-party advertisers. Therefore, it is very difficult to manage various types of network traffic of client applications.
  • Example embodiments of the present disclosure propose a method of managing network traffic of client applications.
  • The method includes: based on the determination of the target user, detecting the network transmission of the user data of the target user from the client application to the server; based on the type of network traffic corresponding to the network transmission, analyzing the network traffic at different layers of the network transmission; and, based on the analysis indicating that the network traffic satisfies the data exchange constraints corresponding to the target user, sending the network traffic to the server defined by the data exchange constraints.
  • FIG. 4A shows a flowchart of an example method 4100 of managing network traffic of mobile applications according to some embodiments of the present disclosure.
  • the method 4100 may be implemented, for example, at the security sandbox system 1090 of FIG. 1 .
  • the mobile terminal application may be the target application 1080 of the mobile terminal.
  • a network transmission of user data of the target user from the target application 1080 to the server is detected.
  • the security sandbox system 1090 may detect network transmission of user data of the target user.
  • network traffic can be routed to security sandbox system 1090 based on the determination of the target user, such that security sandbox system 1090 can detect and analyze network traffic corresponding to network transmissions of user data.
  • the security sandbox system 1090 can analyze the network requests of the target application 1080 and restrict the network requests that do not meet the conditions based on the data exchange constraints.
  • Data exchange constraints may include exchange constraints related to data sovereignty, such as data sovereignty protection rules.
  • Data sovereignty protection rules can be determined according to the regulations of each country or region.
  • Data sovereignty protection rules may also be determined by the operator of the application (for example, related to the user data usage agreement).
  • Data sovereignty protection rules can be set based on specific scenarios. For example, data sovereignty protection rules may stipulate that user data in the data sovereign country is not allowed to be transmitted to any server outside the data sovereign country. In other implementations, the data sovereignty protection rules may stipulate that private user data of the data sovereign state is not allowed to be transmitted to any unregistered server. The scope of the present disclosure is not limited in this regard.
  • the network request of the target application 1080 is analyzed and processed by the security sandbox system 1090 and then transmitted to the application firewall subsystem 1020 .
  • the principles and details of the security sandbox system 1090 will be described in detail below.
  • Target users are users whose transmission of user data needs to be detected and managed.
  • the target user may be a user with the nationality of the data sovereign country.
  • the target user may also be a user determined according to specific rules for data sovereignty protection.
  • the target user may be a user who has the nationality of the data sovereign country and is currently geographically located in the data sovereign country.
  • the target user can be determined based on user information.
  • User information may include user account information, personal information, registration information, and the like.
  • target users may be determined based on device information.
  • the device information may include Subscriber Identity Module (Subscriber Identity Module, SIM) information, IP address, network service provider information, device system setting information, application setting information, and the like.
  • the target user can be determined based on a combination of various information.
  • The various types of information can have different priorities. For example, the priority of SIM information and network service provider information may be higher than that of the IP address, system setting information, application setting information, and the like.
  • the determination of the target user may be based on the determination of the region where the target user is located.
  • the region where the target user is located may be determined by using the above user information or device information, thereby determining the target user.
  • the region setting in the system settings of the smart phone can be used to determine the region where the current user is located, and thereby determine whether the current user is a target user.
  • the country code of the SIM card can be used to determine the region where the target user is located, and the target user can be determined accordingly.
  • the target user can be determined upon initial launch of the application. In other words, it can be determined whether the current user is the target user when the application is initially launched. Alternatively or additionally, whether the current user is the target user can be determined when the user registers. Alternatively or additionally, it may be determined whether the current user is the target user when the user logs in, logs out, or switches accounts.
  • the determination result may be stored locally or in a server. It is possible to store determination results after determining a user as a target user for the first time and set to use the stored determination results within a threshold period of time. In this way, when the user logs in again, there is no need to determine the user again.
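  • The prioritized determination of the target user and the reuse of a stored result within a threshold period could be sketched as follows. The signal names, the exact ordering inside `SOURCE_PRIORITY`, and the `TargetUserCache` class are assumptions; the text above only states that SIM and network service provider information may rank above the IP address and settings information.

```python
import time
from typing import Dict, Optional, Tuple

# Assumed priority order: earlier sources override later ones.
SOURCE_PRIORITY = ("sim_country", "carrier_country", "ip_country", "system_region")


def resolve_region(signals: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the region from the highest-priority signal that is present."""
    for source in SOURCE_PRIORITY:
        region = signals.get(source)
        if region:
            return region
    return None


class TargetUserCache:
    """Stores a determination result and reuses it within a threshold period."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[bool, float]] = {}

    def get(self, user_id: str, now: Optional[float] = None) -> Optional[bool]:
        now = time.time() if now is None else now
        entry = self._entries.get(user_id)
        if entry is None or now - entry[1] > self._ttl:
            return None  # no stored result, or the stored result has expired
        return entry[0]

    def put(self, user_id: str, is_target: bool, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._entries[user_id] = (is_target, now)


def is_target_user(user_id, signals, sovereign_region, cache, now=None) -> bool:
    cached = cache.get(user_id, now)
    if cached is not None:
        return cached  # reuse the stored determination result
    result = resolve_region(signals) == sovereign_region
    cache.put(user_id, result, now)
    return result


cache = TargetUserCache(ttl_seconds=3600)
```

  • Passing `now` explicitly keeps the sketch deterministic; in an application the current time would normally be used and the cache could equally be kept on a server.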
  • network traffic is analyzed at different layers of the network transport based on the type of network traffic to which the network transport corresponds.
  • The network traffic in the target application 1080 may include multiple types of network traffic, such as native, web view (WebView), and third-party software development kit (SDK) types of network traffic.
  • Native type network traffic is generated and processed by operating system (e.g., Android and iOS) code in the business layer.
  • Native types of network traffic can be completely controlled by the owner of the target application 1080 .
  • the network traffic of the third-party SDK type is generated and processed by the third-party SDK.
  • the target application 1080 can access a third-party SDK to implement the functions of login or sharing.
  • the network traffic of the third-party SDK type is generated and processed by these third-party SDKs. It should be understood that third-party SDK-type network traffic is generally not completely under the control of the owner of the application.
  • the network traffic of the web view type may include network traffic controlled by the owner of the application, for example, the network traffic generated by the built-in browser of the application by calling the code of the native application.
  • Web view-type web traffic may also include web traffic controlled by third parties. For example, web traffic generated and controlled by third-party advertisers.
  • the security sandbox system 1090 can adopt a corresponding analysis strategy, so as to better manage the network transmission of user data in the application.
  • the network traffic is sent to the server defined by the data exchange constraints.
  • Different data exchange constraints can be set for different target users. For example, for target users with higher sensitivity levels, stricter data exchange constraints can be set.
  • Data exchange constraints may define which user data may be transferred to which servers.
  • the data exchange constraints corresponding to the target user may be determined based on user information or corresponding device information of the target user.
  • the security sandbox system 1090 may include multiple sub-modules for different types of network traffic. For example, a submodule for managing native network traffic, a submodule for managing webpage view network traffic, and a submodule for managing third-party SDK network traffic. These sub-modules can analyze the corresponding types of network traffic, as well as limit or intercept network traffic that does not meet the data exchange constraints. Details of management for different types of network traffic will be described in detail below with reference to FIGS. 4B to 4E .
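  • One possible way to picture the per-type sub-modules is a dispatch table that routes each request to the analyzer for its traffic type. The enum `TrafficType`, the three analyzer functions, and the request dictionary keys are hypothetical placeholders for the rules that the sub-modules would actually apply.

```python
from enum import Enum
from typing import Callable, Dict


class TrafficType(Enum):
    NATIVE = "native"
    WEB_VIEW = "web_view"
    THIRD_PARTY_SDK = "third_party_sdk"


def analyze_native(request: dict) -> bool:
    # Hypothetical network-layer rule: only registered schemas may pass.
    return bool(request.get("schema_registered", False))


def analyze_web_view(request: dict) -> bool:
    # Web view traffic is diverted to the native path and checked there.
    return analyze_native(request)


def analyze_sdk(request: dict) -> bool:
    # Hypothetical API-layer rule: block calls that request user data.
    return not request.get("requests_user_data", True)


SUBMODULES: Dict[TrafficType, Callable[[dict], bool]] = {
    TrafficType.NATIVE: analyze_native,
    TrafficType.WEB_VIEW: analyze_web_view,
    TrafficType.THIRD_PARTY_SDK: analyze_sdk,
}


def manage(traffic_type: TrafficType, request: dict) -> str:
    """Return "send" if the constraints are met, otherwise "restrict"."""
    allowed = SUBMODULES[traffic_type](request)
    return "send" if allowed else "restrict"
```

  • Missing information defaults to the restrictive outcome in this sketch, which is a design choice of the example rather than something the disclosure requires.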
  • FIG. 4B shows a schematic diagram of an analysis and restriction process 4200 for native type network traffic according to some embodiments of the present disclosure.
  • FIG. 4B shows a sub-module 4210 for analyzing and restricting native types of network traffic.
  • the sub-module 4210 may be a part of the security sandbox system 1090 or a specific implementation of the security sandbox system 1090 .
  • the business logic layer 4220 sends the network request to the underlying OS 4230.
  • the service logic layer 4220 may be a specific implementation of the application service logic 1100 shown in FIG. 1 in terms of network transmission.
  • the sub-module 4210 can be used as an interceptor to analyze and restrict network requests at the network layer.
  • the sub-module 4210 may restrict network requests by analyzing endpoints, parameters of network requests, or schemas. For example, whether to limit the network request can be based on whether the schema has been registered. Alternatively or additionally, whether to restrict the network request may be determined based on whether the requested field in the network request involves sensitive information.
  • The sub-module 4210 may include an interceptor for Android and an interceptor for iOS. Additionally, the sub-module 4210 may also include an interceptor for C++. In this manner, by analyzing and restricting the network request at the network layer, it is possible to better determine whether the network request should be restricted based on the protocol information of the network request.
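  • A simplified, platform-neutral sketch of such a network-layer check is shown below. It treats "registered" as membership of the scheme and host in a hypothetical `REGISTERED_ENDPOINTS` set and treats a request as sensitive when any query parameter name appears in a hypothetical `SENSITIVE_FIELDS` set; both sets and the function name `intercept` are assumptions, and a real interceptor would sit in the platform's network stack.

```python
from urllib.parse import parse_qs, urlsplit

REGISTERED_ENDPOINTS = {("https", "api.example.com")}  # hypothetical registry
SENSITIVE_FIELDS = {"phone", "email", "location"}      # hypothetical field names


def intercept(url: str) -> bool:
    """Return True if the request may proceed, False if it should be restricted."""
    parts = urlsplit(url)
    # Restrict requests whose scheme and host have not been registered.
    if (parts.scheme, parts.hostname) not in REGISTERED_ENDPOINTS:
        return False
    # Restrict requests whose parameters involve sensitive fields.
    requested_fields = set(parse_qs(parts.query, keep_blank_values=True))
    return not (requested_fields & SENSITIVE_FIELDS)
```

  • For instance, a request to a registered endpoint with only a paging parameter would pass, while the same endpoint queried with a `phone` parameter would be restricted.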
  • FIG. 4C shows a schematic diagram of an analysis and restriction process 4300 for network traffic of web view type according to some embodiments of the present disclosure.
  • FIG. 4C shows a sub-module 4310 for analyzing and restricting network traffic of a web view type.
  • the sub-module 4310 may be a part of the security sandbox system 1090 or a specific implementation of the security sandbox system 1090 .
  • the sub-module 4310 may divert the web view type network traffic to a native network interface, so that the web view type network traffic may be analyzed and limited by the sub-module 4210 for native type network traffic.
  • the sub-module 4310 can utilize a JavaScript (JS) hook mechanism to transfer web view-type network traffic to a native network interface.
  • the submodule 4310 may include an enabler 4311 , a navigation URL interceptor 4312 and an internal request interceptor 4313 .
  • the sub-module 4310 can communicate with the browser 4320 built into the application, so that the web traffic of the web view type can be managed and detected by the sub-module 4310 .
  • The enabler 4311 can perform JS injection when the built-in browser 4320 of the application is opened (created), so that web view type network traffic can be diverted to the native network interface by using the hook mechanism. Network traffic diverted to the native network interface can be taken over by the native network module.
  • the JS hook technology can be used to transfer network traffic in the following manner.
  • The navigation URL interceptor 4312 can analyze and restrict the URL of the main page (initial page). For example, the navigation URL interceptor 4312 can determine whether to restrict the network request based on whether the schema of the URL is registered. If the network request is not restricted, the browser 4320 can load the main page.
  • The internal request interceptor 4313 can divert the network traffic related to the static resources and dynamic resources of the main page to the native network interface, so that this network traffic can be analyzed and restricted by the sub-module 4210 at the network layer.
  • The specific analysis and restriction process is similar to that for native type network traffic, and will not be repeated here.
  • The sub-module 4310 may adopt different analysis and restriction strategies for web view traffic controlled by the owner of the application and web view traffic controlled by a third party. For example, for web view type network traffic controlled by a third party, only the navigation URL interceptor 4312 may be used, analyzing the relevant network traffic by determining whether the URL of the main page is registered, without further analyzing the static resources and dynamic resources of the main page.
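  • The two web view strategies described above can be contrasted in a short sketch. It is written in Python for consistency with the other examples and therefore models only the decision logic, not the JavaScript hook itself. The set `REGISTERED_PAGES`, the functions `allow_navigation` and `allow_page_load`, and the caller-supplied `check_resource` callback (standing in for the native network-layer check) are assumptions.

```python
from typing import Callable, Iterable
from urllib.parse import urlsplit

REGISTERED_PAGES = {"https://pages.example.com"}  # hypothetical registered origins


def origin_of(url: str) -> str:
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def allow_navigation(url: str) -> bool:
    """Navigation URL check: is the main page URL registered?"""
    return origin_of(url) in REGISTERED_PAGES


def allow_page_load(
    main_url: str,
    resource_urls: Iterable[str],
    third_party: bool,
    check_resource: Callable[[str], bool],
) -> bool:
    if not allow_navigation(main_url):
        return False
    if third_party:
        # Third-party pages: only the main page URL is checked.
        return True
    # Owner-controlled pages: every static or dynamic resource request is
    # also passed to the native-layer check supplied by the caller.
    return all(check_resource(url) for url in resource_urls)
```

  • The `third_party` flag is what switches between the lighter navigation-only check and the full check of the main page's resources.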
  • FIG. 4D shows a schematic diagram of an analysis and restriction process 4400 for third-party SDK-type network traffic according to some embodiments of the present disclosure.
  • FIG. 4D shows a sub-module 4410 for analyzing and limiting network traffic of the third party SDK type.
  • the sub-module 4410 may be a part of the security sandbox system 1090 or a specific implementation of the security sandbox system 1090 .
  • the sub-module 4410 can analyze and limit the network traffic of the third-party SDK type at the application program interface (API) layer.
  • the sub-module 4410 can limit the network traffic of the third-party SDK type by analyzing whether the data requested by the API of the third-party SDK meets the data exchange constraints at the API layer.
  • The sub-module 4410 can wrap the API requesting user data in the third-party SDK, and add judgment logic based on data exchange constraints in the wrapper. In other words, the sub-module 4410 can determine the wrapped API by adding judgment logic to the API of the third-party SDK. In this way, the business logic layer 4220 does not directly call the API of the third-party SDK, but calls the wrapped API to which the judgment logic has been added.
  • The sub-module 4410 may include wrapping modules for each third-party SDK.
  • For example, the sub-module 4410 may include a wrapping module 4412 for SDK 4411, a wrapping module 4414 for SDK 4413, and a wrapping module 4416 for SDK 4415.
  • The wrapping module (for example, wrapping module 4412) can wrap the API in the corresponding SDK (for example, SDK 4411), so as to generate the corresponding wrapped API.
  • the sub-module 4410 can dynamically add wrapping modules to wrap APIs of third-party SDKs.
  • APIs of third-party SDKs can be wrapped in the following ways.
  • The wrapping module 4412 can define the same API as the API in the SDK 4411 that is exposed to the business layer.
  • The wrapping module 4412 can implement the API and define wrapper classes for the SDK 4411 data types.
  • the judgment logic can determine whether the API of the wrapped third-party SDK can be called based on the data exchange constraints. In some implementations, the judgment logic may analyze whether the API of the third-party SDK can be called based on the name of the SDK, the name of the API, the name of the parameter of the API, and the like. If the judgment result is yes, the API of the third-party SDK can be called and return a value to the business layer. If the judgment result is no, the API of the third-party SDK is not called, that is, the network traffic related to the API is restricted. It should be understood that the judgment logic may vary based on specific scenarios. For example, the judgment logic can be set to not allow the user's private data to be transmitted to the third-party SDK.
  • the sub-module 4410 can manage and detect third-party SDK-type network traffic without knowing the internal code of the third-party SDK.
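  • The wrapping approach can be sketched as a function that returns a wrapped API with the same keyword interface as the SDK API and runs the judgment logic before delegating. Everything named here (`wrap_sdk_api`, `RestrictedCall`, the stand-in `share` function, and the `BLOCKED` set keyed by SDK name, API name, and parameter name) is hypothetical and does not correspond to any real SDK.

```python
from typing import Any, Callable, Set, Tuple


class RestrictedCall(Exception):
    """Raised when the judgment logic refuses to call the SDK API."""


def wrap_sdk_api(
    sdk_name: str,
    api_name: str,
    api: Callable[..., Any],
    blocked: Set[Tuple[str, str, str]],
) -> Callable[..., Any]:
    """Return a wrapped API that checks each parameter name before delegating."""

    def wrapped(**params: Any) -> Any:
        for param_name in params:
            # Judgment logic based on SDK name, API name, and parameter name.
            if (sdk_name, api_name, param_name) in blocked:
                raise RestrictedCall(f"{sdk_name}.{api_name}({param_name})")
        return api(**params)  # allowed: call the real SDK API

    return wrapped


def share(title: str, contact_list=None) -> str:
    # Stand-in for a third-party SDK function; not a real SDK.
    return f"shared:{title}"


BLOCKED = {("share_sdk", "share", "contact_list")}
safe_share = wrap_sdk_api("share_sdk", "share", share, BLOCKED)
```

  • The business layer would call `safe_share` instead of `share`, so the check runs without any knowledge of the SDK's internal code.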
  • FIG. 4E shows a block diagram of a security sandbox system 1090 according to some embodiments of the present disclosure.
  • The security sandbox system 1090 includes an activation module 4520.
  • The activation module 4520 is configured to initiate detection of network transmission of user data of the target user from the client application to the server based on the determination of the target user.
  • the activation module 4520 may activate the management module to detect, manage, analyze and restrict network traffic corresponding to network transmission of user data.
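  • The management module is configured to analyze the network traffic at different layers of the network transmission based on the type of network traffic corresponding to the network transmission; and, based on the analysis indicating that the network traffic satisfies the data exchange constraints corresponding to the target user, send the network traffic to the server defined by the data exchange constraints.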
  • the management module may include a submodule (also referred to as a first management module) 4210 , a submodule (also referred to as a second management module) 4310 and a submodule (also referred to as a third management module) 4410 .
  • Sub-module 4210, sub-module 4310, and sub-module 4410 may analyze and limit network traffic of client applications.
  • the sub-module 4210 is configured to analyze the network traffic at the network layer based on the type of the network traffic being the native type of network traffic.
  • The sub-module 4310 is configured to, based on the type of the network traffic being web view type network traffic, divert the web view type network traffic to the network interface of the client application so that it is managed by the native network module of the client application, and to analyze the diverted network traffic at the network layer.
  • diverting the network traffic of the webpage view type to the network interface of the mobile application includes: utilizing a JavaScript hook mechanism to divert the network traffic of the webpage view type.
  • the submodule 4410 is configured to analyze the network traffic at the application program interface API layer based on the type of network traffic being the third-party SDK type of network traffic.
  • analyzing the network traffic at the API layer includes: determining the package API by adding judgment logic based on the data exchange constraints to the API of the third-party SDK; and calling the package API to use the judgment logic to Analyze the network traffic.
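  • The API-layer wrapping described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the host allow-list `ALLOWED_HOSTS`, the exception `BlockedRequest`, the decorator `with_judgment_logic` and the stand-in SDK function `sdk_send` are all hypothetical names introduced here, and the only "data exchange constraint" modeled is a check on the destination host.

```python
from functools import wraps
from urllib.parse import urlparse

# Hypothetical data exchange constraint: hosts the SDK is allowed to contact.
ALLOWED_HOSTS = {"api.example.com"}


class BlockedRequest(Exception):
    """Raised when a request violates the (assumed) data exchange constraint."""


def with_judgment_logic(sdk_call):
    """Return a wrapped API that checks the destination before calling the SDK."""
    @wraps(sdk_call)
    def wrapped(url, payload):
        host = urlparse(url).hostname
        if host not in ALLOWED_HOSTS:
            raise BlockedRequest(f"host not permitted: {host}")
        return sdk_call(url, payload)
    return wrapped


def sdk_send(url, payload):
    """Stand-in for a third-party SDK API; it only echoes its arguments."""
    return {"url": url, "bytes": len(payload)}


# Callers use the wrapped API instead of the raw SDK function.
wrapped_send = with_judgment_logic(sdk_send)
```

  • In this sketch a permitted request passes through unchanged, while a request to any other host raises `BlockedRequest` before the SDK code runs.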
  • The activation module 4520 can activate the submodules 4210, 4310 and 4410 based on the determination of the target user. For example, the activation module 4520 may determine whether the current user is a target user when the user registers and, if so, activate the submodules 4210, 4310 and 4410. As another example, the activation module 4520 may obtain a determination result for the user from local storage or from the server when the user logs in, and decide whether to activate the submodules 4210, 4310 and 4410 based on that result.
  • The security sandbox subsystem 1090 may also include a sampling module 4510 for sampling network traffic.
  • The sampling module 4510 can send a sampling signal to the activation module 4520 to trigger it.
  • The sampling signal may indicate the sampling rate at which network traffic is sampled.
  • The sampling module 4510 can sample target users and different types of network traffic based on the data exchange constraints. For example, the sampling module 4510 can sample different types of network traffic at different sampling rates. With the sampling module 4510, only part of the network traffic needs to be analyzed, which reduces overhead and helps keep the application stable.
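  • Per-type sampling of this kind might look like the sketch below. The traffic type names and the rates in `SAMPLING_RATES` are assumptions chosen for illustration; the source gives no concrete values, and `should_analyze` is a hypothetical helper name.

```python
import random

# Hypothetical sampling rates per traffic type (not taken from the source).
SAMPLING_RATES = {"native": 0.10, "webview": 0.25, "third_party_sdk": 0.50}


def should_analyze(traffic_type, rng=random):
    """Return True if a request of this type is selected for analysis.

    Unknown traffic types get a rate of 0.0 and are never sampled.
    """
    rate = SAMPLING_RATES.get(traffic_type, 0.0)
    return rng.random() < rate
```

  • Passing a seeded `random.Random` instance as `rng` makes the selection reproducible, which is convenient when testing the analysis path.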
  • The security sandbox subsystem 1090 may also include other modules, or include only some of the modules shown in FIG. 4E.
  • For example, the security sandbox subsystem 1090 may omit the submodule 4310 for web view type network traffic.
  • The scope of the present disclosure is not limited in this regard.
  • The network traffic can also be analyzed and restricted at the socket layer.
  • For example, network traffic of the third-party SDK type can be diverted at the socket layer so that network requests of the third-party SDK type can be analyzed directly.
  • A local server acting as a proxy can also be set up in the target application 1080.
  • The network requests of the target application 1080 can be forwarded to the local server. By analyzing and restricting the network traffic at the local server, the requests that the local server forwards to external servers can be managed. In this way, different types of network traffic can be analyzed and restricted with protocol information taken into account, which better prevents the application's network traffic from being transmitted to unauthorized external servers.
  • The security sandbox subsystem 1090 can directly analyze and restrict the network traffic in the target application 1080. In other words, only network traffic that the security sandbox subsystem 1090 does not restrict can continue to flow. Alternatively or additionally, the security sandbox subsystem 1090 may not restrict network traffic directly and may only provide analysis reports. In this case, a copy of each network request may be sent to the security sandbox subsystem 1090 while the request itself is transmitted normally. The security sandbox subsystem 1090 can analyze the copy and provide an analysis report.
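  • The two operating modes described above (restricting traffic versus only reporting on it) can be contrasted in a small sketch. The class `Sandbox`, its `blocked_hosts` set and the `enforce` flag are hypothetical names, and matching on the destination host is an assumed stand-in for whatever analysis the subsystem actually performs.

```python
from dataclasses import dataclass, field


@dataclass
class Sandbox:
    """Toy model of a sandbox that either blocks or only reports violations."""
    blocked_hosts: set
    enforce: bool = True                        # False means report-only mode
    report: list = field(default_factory=list)  # violations seen so far

    def handle(self, host):
        """Record violations and return True if the request may proceed."""
        violation = host in self.blocked_hosts
        if violation:
            self.report.append(host)
        return not (violation and self.enforce)
```

  • In enforcing mode a violating request is recorded and stopped; in report-only mode it is recorded but still allowed to proceed, which mirrors analyzing a copy of the request while the original is transmitted normally.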
  • Multiple security sandbox subsystems 1090 may be set up so that each one handles a different data sovereignty country. For example, based on a determination of the region where the target user is located, the corresponding security sandbox subsystem can be activated to analyze and restrict network traffic, so that the network transmission of user data in the application complies with the data sovereignty protection rules of the corresponding country.
  • The target application can, for example, provide the user with various content recommendations through a recommendation mechanism, such as multimedia content recommendations, user recommendations and commodity recommendations.
  • The fairness of recommendation policies has become a management focus in many areas.
  • Some applications may use a recommendation mechanism to steer users toward specific content that has nothing to do with their habits, and such a recommendation mechanism may be illegal.
  • FIG. 5 shows a flowchart of a process 500 of managing recommendation policies.
  • The process 500 can be performed, for example, by the recommendation management subsystem 1050.
  • The recommendation management subsystem 1050 acquires a set of object features associated with a set of objects in the target application, where the set of object features is obtained by converting the attributes of the set of objects and does not directly express those attributes.
  • The recommendation management subsystem 1050 can obtain the set of object features via an application programming interface (API) provided by the target application. In some embodiments, the recommendation management subsystem 1050 can obtain the set of object features associated with a set of objects in the target application 1080 from the target application platform 1030, e.g., via a dedicated API.
  • The set of object features may, for example, be converted from the attributes of the set of objects by a feature extraction model. In this way, neither the manager of the recommendation policy nor any other third party can determine the original attribute information of an object from its features, which protects the security of data in the target application.
  • The recommendation management subsystem 1050 determines a first object feature and a second object feature from the set of object features, where a first difference between the first object feature and the second object feature is less than a first threshold.
  • The set of object features may be represented, for example, as a plurality of vectors.
  • The recommendation management subsystem 1050 may, for example, select from the set of object features at least one pair of object features whose difference, computed between their vectors, is smaller than the first threshold.
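  • The pair selection described above can be sketched as follows. The source does not say how the difference between vectors is measured; Euclidean distance (via `math.dist`) is an assumption made here, and `select_similar_pairs` is a hypothetical helper name.

```python
import math
from itertools import combinations


def select_similar_pairs(features, first_threshold):
    """Return index pairs (i, j), i < j, whose vectors differ by less than the threshold.

    The difference is measured as Euclidean distance, which is an assumption.
    """
    return [
        (i, j)
        for (i, u), (j, v) in combinations(enumerate(features), 2)
        if math.dist(u, v) < first_threshold
    ]
```

  • This brute-force version compares every pair, so its cost grows quadratically with the number of objects; for large feature sets an approximate nearest-neighbor index would be the usual substitute.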
  • The recommendation management subsystem 1050 determines, based on the recommendation policy in the target application, a first recommendation result corresponding to the first object feature and a second recommendation result corresponding to the second object feature.
  • The recommendation management subsystem 1050 may provide the first object feature to a recommendation model associated with the recommendation policy to determine the first recommendation result, and provide the second object feature to the recommendation model to determine the second recommendation result.
  • The recommendation management subsystem 1050 may send the selected first and second object features, via the API provided by the target application, to a remotely running recommendation model, which determines the first and second recommendation results.
  • The recommendation model may be run by, for example, the maintainer of the target application.
  • Generating the first and second recommendation results does not affect the recommendation model actually deployed in the target application.
  • The first recommendation result and the second recommendation result may be represented by vectors output by the recommendation model. The recommendation management subsystem 1050 therefore cannot directly interpret the semantics of the two results, which further improves the security of data in the target application.
  • The recommendation management subsystem 1050 evaluates the recommendation policy based on the first recommendation result and the second recommendation result.
  • The recommendation management subsystem 1050 can determine a second difference between the first recommendation result and the second recommendation result, and determine the fairness of the recommendation policy by comparing the second difference with a second threshold.
  • If the recommendation management subsystem 1050 determines that the second difference exceeds the second threshold, it may determine that the recommendation policy has poor fairness.
  • The recommendation management subsystem 1050 may also determine the fairness of the recommendation policy based on the proportion of object feature pairs whose second difference exceeds the second threshold. For example, the recommendation management subsystem 1050 may randomly sample multiple object feature pairs and, if the proportion of pairs whose second difference exceeds the second threshold is greater than a threshold proportion, determine that the recommendation policy has poor fairness.
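  • The proportion-based check described above can be sketched as follows. The function name `evaluate_fairness`, the parameter `max_violation_ratio` (the "threshold proportion") and the use of Euclidean distance for the second difference are assumptions; `recommend` stands in for a call to the recommendation model, which in the disclosure is run remotely by the application's maintainer.

```python
import math


def evaluate_fairness(pairs, recommend, second_threshold, max_violation_ratio):
    """Evaluate a recommendation policy on pairs of similar object features.

    pairs: list of (feature_a, feature_b) already selected as similar.
    recommend: callable mapping a feature vector to a result vector.
    Returns (is_fair, violation_ratio).
    """
    if not pairs:
        raise ValueError("at least one feature pair is required")
    violations = sum(
        1
        for a, b in pairs
        if math.dist(recommend(a), recommend(b)) > second_threshold
    )
    ratio = violations / len(pairs)
    return ratio <= max_violation_ratio, ratio
```

  • The intuition is that similar inputs should receive similar recommendations; a model whose output jumps sharply between two nearly identical feature vectors contributes a violation.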
  • The recommendation management subsystem 1050 can also determine the fairness of the recommendation policy based on the correlation between the object features input to the recommendation model and historical recommendation results. Specifically, the recommendation management subsystem 1050 may acquire a third object feature and the historical recommendation results for the third object feature from the target application, and then determine the fairness of the recommendation policy based on the correlation between the third object feature and the historical recommendation results. For example, the recommendation management subsystem 1050 may determine whether the object features match the category information of the historical recommendation results.
  • The recommendation management subsystem 1050 may determine vector representations of the third object feature and of the historical recommendation result, and determine the correlation between them based on the difference between the two vector representations. For example, if the vector difference between an object and its historical recommendation results is greater than a threshold, the recommendation management subsystem 1050 may determine that the recommendation policy has poor fairness.
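  • A correlation check of this kind might be sketched as below. It assumes that the object feature and the historical recommendation result are embedded in the same vector space and that the difference is measured as cosine distance; neither assumption comes from the source, and both function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("a zero vector has no direction")
    return 1.0 - dot / norm


def history_matches_feature(feature_vec, history_vec, threshold):
    """Return True if the historical result stays close to the object feature."""
    return cosine_distance(feature_vec, history_vec) <= threshold
```

  • A result of `False` corresponds to the case in the text where the vector difference exceeds the threshold and the policy may be judged to have poor fairness.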
  • The secure computing subsystem 1060 may also inspect, for example, the source code associated with the recommendation policy.
  • The secure computing subsystem 1060 may, for example, obtain the source code corresponding to the recommendation policy and evaluate the recommendation policy based on the source code or on intermediate code corresponding to the source code.
  • The recommendation policy may be used, for example, to recommend at least one piece of multimedia content to the user in the target application 1080.
  • Multimedia content may include, for example, images, videos, music, or combinations thereof.
  • FIG. 6 shows a schematic structural block diagram of an apparatus 600 for managing a recommendation policy according to some embodiments of the present disclosure.
  • The apparatus 600 includes an acquisition module 610 configured to acquire a set of object features associated with a set of objects in the target application, where the set of object features is obtained by converting the attributes of the set of objects and does not directly express those attributes.
  • The apparatus 600 also includes a selection module 620 configured to determine a first object feature and a second object feature from the set of object features, where a first difference between the first object feature and the second object feature is less than a first threshold.
  • The apparatus 600 also includes a determination module 630 configured to determine, based on the recommendation policy in the target application, a first recommendation result corresponding to the first object feature and a second recommendation result corresponding to the second object feature.
  • The apparatus 600 further includes an evaluation module 640 configured to evaluate the recommendation policy based on the first recommendation result and the second recommendation result.
  • The acquisition module 610 is further configured to acquire the set of object features via an application programming interface (API) provided by the target application.
  • The determination module 630 is further configured to: provide the first object feature to the recommendation model associated with the recommendation policy to determine the first recommendation result; and provide the second object feature to the recommendation model to determine the second recommendation result.
  • The recommendation model is run by the maintainer of the target application.
  • The first recommendation result and the second recommendation result are represented by vectors output by the recommendation model.
  • The evaluation module 640 is further configured to: determine a second difference between the first recommendation result and the second recommendation result; and determine the fairness of the recommendation policy based on a comparison between the second difference and the second threshold.
  • The apparatus 600 further includes: a history acquisition module configured to acquire the third object feature and the historical recommendation result for the third object feature from the target application, where the historical recommendation result includes a vector representation generated by the recommendation model; and a comparison module configured to determine the fairness of the recommendation policy based on the third object feature and the historical recommendation result.
  • The recommendation policy is used to recommend at least one piece of multimedia content to the user in the target application.
  • The apparatus 600 further includes: a code obtaining module configured to obtain source code corresponding to the recommendation policy; and a code evaluating module configured to evaluate the recommendation policy based on the source code or on intermediate code corresponding to the source code.
  • Fig. 7 shows a schematic block diagram of an example device 700 that may be used to implement embodiments of the present disclosure.
  • system 100 and/or system 400 may be implemented by device 700 .
  • The device 700 includes a central processing unit (CPU) 701 that can perform various appropriate actions and processes according to computer program instructions stored in a read-only memory (ROM) 702 or loaded from a storage unit 708 into a random-access memory (RAM) 703.
  • The RAM 703 can also store various programs and data necessary for the operation of the device 700.
  • The CPU 701, ROM 702 and RAM 703 are connected to each other via a bus 704.
  • An input/output (I/O) interface 705 is also connected to the bus 704.
  • Components connected to the I/O interface 705 include: an input unit 706, such as a keyboard or a mouse; an output unit 707, such as various types of displays and speakers; a storage unit 708, such as a magnetic disk or an optical disk; and a communication unit 709, such as a network card, a modem or a wireless communication transceiver.
  • The communication unit 709 allows the device 700 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.
  • The process 500 may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 708.
  • Part or all of the computer program may be loaded and/or installed on the device 700 via the ROM 702 and/or the communication unit 709.
  • When the computer program is loaded into the RAM 703 and executed by the CPU 701, one or more actions of the process 500 described above may be performed.
  • The present disclosure may be a method, apparatus, system and/or computer program product.
  • A computer program product may include a computer-readable storage medium having computer-readable program instructions thereon for carrying out various aspects of the present disclosure.
  • A computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • A computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the above.
  • Computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or electrical signals transmitted through wires.
  • Computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • A network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in that computing/processing device.
  • Computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, via the Internet using an Internet service provider).
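  • An electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), may be customized using state information of the computer-readable program instructions, and the electronic circuit may execute the computer-readable program instructions to implement aspects of the present disclosure.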
  • These computer-readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processing unit of the computer or other programmable data processing apparatus, produce an apparatus for realizing the functions/actions specified in one or more blocks in the flowchart and/or block diagram.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium. The instructions cause computers, programmable data processing devices and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions constitutes an article of manufacture that includes instructions for implementing various aspects of the functions/acts specified in one or more blocks in the flowcharts and/or block diagrams.
  • Each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions that includes one or more executable instructions.
  • The functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified function or action, or by a combination of dedicated hardware and computer instructions.

Abstract

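According to embodiments of the present disclosure, a method, apparatus, electronic device, storage medium and program product for managing a recommendation policy are provided. The method described herein includes: acquiring a set of object features associated with a set of objects in a target application, where the set of object features is obtained by converting the attributes of the set of objects and does not directly express those attributes; determining a first object feature and a second object feature from the set of object features, where the difference between the first object feature and the second object feature is less than a first threshold; determining, based on the recommendation policy in the target application, a first recommendation result corresponding to the first object feature and a second recommendation result corresponding to the second object feature; and evaluating the recommendation policy based on the first recommendation result and the second recommendation result. In this way, embodiments of the present disclosure can effectively support management of the recommendation policy in an application without exposing the application's raw data.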

Description

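Method and apparatus for managing a recommendation policy
This application claims priority to Chinese Patent Application No. 202111256520.5, entitled "Method and apparatus for managing a recommendation policy" and filed with the China National Intellectual Property Administration on October 27, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
Implementations of the present disclosure relate to the field of computers and, more specifically, to a method, apparatus, device and computer storage medium for managing a recommendation policy.
Background
With the development of Internet technology, a wide variety of Internet applications have become an important part of people's lives. Such applications generate massive amounts of data every day, which raises data security issues in many respects, such as data sovereignty protection. For example, some countries may prohibit specific types of user data from being sent to overseas servers.
For some globalized applications, this challenge is even more pronounced. Such an application may need to serve users in multiple different regions on the same technical architecture. However, these regions may have entirely different data security constraints, such as specific data sovereignty protection requirements, which makes data security protection even more difficult.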
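Summary
In a first aspect of the present disclosure, a method of managing a recommendation policy is provided. The method includes: acquiring a set of object features associated with a set of objects in a target application, where the set of object features is obtained by converting the attributes of the set of objects and does not directly express those attributes; determining a first object feature and a second object feature from the set of object features, where a first difference between the first object feature and the second object feature is less than a first threshold; determining, based on the recommendation policy in the target application, a first recommendation result corresponding to the first object feature and a second recommendation result corresponding to the second object feature; and evaluating the recommendation policy based on the first recommendation result and the second recommendation result.
In a second aspect of the present disclosure, an apparatus for managing a recommendation policy is provided. The apparatus includes: an acquisition module configured to acquire a set of object features associated with a set of objects in a target application, where the set of object features is obtained by converting the attributes of the set of objects and does not directly express those attributes; a selection module configured to determine a first object feature and a second object feature from the set of object features, where a first difference between the first object feature and the second object feature is less than a first threshold; a determination module configured to determine, based on the recommendation policy in the target application, a first recommendation result corresponding to the first object feature and a second recommendation result corresponding to the second object feature; and an evaluation module configured to evaluate the recommendation policy based on the first recommendation result and the second recommendation result.
In a third aspect of the present disclosure, an electronic device is provided, including a memory and a processor, where the memory is used to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method according to the first aspect of the present disclosure.
In a fourth aspect of the present disclosure, a computer-readable storage medium is provided, on which one or more computer instructions are stored, where the one or more computer instructions are executed by a processor to implement the method according to the first aspect of the present disclosure.
In a fifth aspect of the present disclosure, a computer program product is provided, which includes one or more computer instructions, where the one or more computer instructions are executed by a processor to implement the method according to the first aspect of the present disclosure.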
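Brief Description of the Drawings
The above and other features, advantages and aspects of embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. In the drawings, the same or similar reference numerals denote the same or similar elements, in which:
FIG. 1 shows a schematic block diagram of a data security protection system according to embodiments of the present disclosure;
FIG. 2 shows a schematic block diagram of a secure computing subsystem according to some embodiments of the present disclosure;
FIG. 3A shows an example deployment environment in which a data exchange subsystem is deployed according to some embodiments of the present disclosure;
FIG. 3B shows implementations of the DES in an internal data center (IDC) on the TTP side and in an overseas internal data center (RoW IDC) on the non-TTP side according to some embodiments of the present disclosure;
FIG. 3C shows a block diagram of an example architecture of a DE according to some embodiments of the present disclosure;
FIG. 3D shows a flowchart of a data exchange process according to some embodiments of the present disclosure;
FIG. 3E shows a flowchart of example data flows for the various kinds of data processing implemented at the DES according to some embodiments of the present disclosure;
FIG. 3F shows a schematic block diagram of a data exchange architecture involving an MQ channel according to some embodiments of the present disclosure;
FIG. 3G shows a schematic block diagram of a data exchange architecture involving an HDFS channel according to some embodiments of the present disclosure;
FIG. 3H shows a schematic diagram of a target object storage (TOS) channel through which data is replicated from the TTP IDC to the overseas IDC according to some embodiments of the present disclosure;
FIG. 3I shows a schematic diagram of a TOS channel through which data is replicated from the overseas IDC to the TTP IDC according to some embodiments of the present disclosure;
FIG. 3J shows a message sequence diagram in the TOS channel according to some embodiments of the present disclosure;
FIG. 3K shows a schematic block diagram of a data exchange architecture involving a service invocation channel according to some embodiments of the present disclosure;
FIG. 3L shows an example of data exchange from non-TTP to TTP in the service invocation channel according to some embodiments of the present disclosure;
FIG. 3M shows an example of data exchange from TTP to non-TTP in the service invocation channel according to some embodiments of the present disclosure;
FIG. 4A shows a flowchart of a method of managing network traffic of a mobile application according to some embodiments of the present disclosure;
FIG. 4B shows a schematic diagram of the analysis and restriction process for native type network traffic according to some embodiments of the present disclosure;
FIG. 4C shows a schematic diagram of the analysis and restriction process for web view type network traffic according to some embodiments of the present disclosure;
FIG. 4D shows a schematic diagram of the analysis and restriction process for third-party SDK type network traffic according to some embodiments of the present disclosure;
FIG. 4E shows a block diagram of a security sandbox subsystem according to some embodiments of the present disclosure;
FIG. 5 shows a flowchart of an example process of managing a recommendation policy according to some embodiments of the present disclosure;
FIG. 6 shows an example block diagram of an apparatus for managing a recommendation policy according to some embodiments of the present disclosure; and
FIG. 7 shows a block diagram of an example device that can be used to implement embodiments of the present disclosure.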
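Detailed Description
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
In the description of embodiments of the present disclosure, the term "including" and similar terms should be understood as open-ended inclusion, that is, "including but not limited to". The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first", "second" and so on may refer to different or the same objects. Other explicit and implicit definitions may also be included below.
The basic principles and several example implementations of the present disclosure are described below with reference to the accompanying drawings.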
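Overall architecture of the data security protection system
According to embodiments of the present disclosure, a data security protection system is provided. FIG. 1 shows a schematic block diagram of a data security protection system 1000 according to embodiments of the present disclosure. As shown in FIG. 1, the data security protection system 1000 includes multiple subsystems that protect, from different dimensions, the security of the data generated while a user uses the target application.
Generally, to support operation of the target application, on the one hand a user needs to run the target application 1080 on a suitable electronic device. On the other hand, a target application platform 1030 needs to be deployed in a suitable computing environment (for example, a cloud computing environment) to run the various types of services that support normal operation of the target application 1080.
In some embodiments, the data security protection system 1000 may first protect the data generated by the target application 1080 at run time from the perspective of the security of the running code. As shown in FIG. 1, the data security protection system 1000 may include a secure computing subsystem 1060, which can be used to ensure the security of the code corresponding to the target application 1080 and the security of the code corresponding to the target application platform 1030.
Service executables compiled by the secure computing subsystem 1060 may, for example, be deployed to the target application platform 1030, and installation files of the target application (for example, apk files) compiled by the secure computing subsystem 1060 may, for example, be published to an application store 1120. The specific implementation of the secure computing subsystem 1060 is discussed in detail below with reference to FIG. 2.
In some embodiments, as shown in FIG. 1, the secure computing subsystem 1060 may be based on cloud infrastructure 1070. In some embodiments, the cloud infrastructure 1070 may be provided by a trusted partner. In the present disclosure, a "trusted partner" may also be called a Trusted Technology Partner (TTP), which may include any individual, enterprise or organization that is technically trusted within a specific region (for example, a specific country or jurisdiction).
In some embodiments, as shown in FIG. 1, the data security protection system 1000 may include a trusted security environment 1010 provided by the TTP. Unlike a conventional application platform deployment, the target application platform 1030 may be deployed in the trusted security environment 1010 to improve the security of the data generated by the target application platform 1030 as well as the transparency and trustworthiness of its operating mechanism.
In some embodiments, the target application 1080 may provide content recommendation services to users through a recommendation algorithm. Such content recommendations may include, but are not limited to, multimedia content recommendations, user recommendations and commodity recommendations. Because more and more recommendation systems use machine learning to implement recommendation functions, managing the recommendation mechanism at the code level alone may not be enough to ensure fair recommendations.
As shown in FIG. 1, the data security protection system 1000 may also include a recommendation management subsystem 1050, which may, for example, test the recommendation algorithm run by the target application platform 1030 to ensure the fairness of the recommendation mechanism in the target application 1080. The specific implementation of the recommendation management subsystem 1050 is described in detail below.
In some embodiments, when running services to support normal operation of the target application 1080, the target application platform 1030 may need to interact with applications or data centers outside the target region where it is currently deployed (for example, a specific country or jurisdiction), also referred to as overseas applications or overseas data centers.
A target region usually constrains, through laws or regulations, communication between data generated within the region and parties outside it. Specific types of data generated in the target region may be prohibited from being transferred abroad. To ensure that the target application platform 1030 remains compliant when communicating overseas, the data security protection system 1000 may include a data exchange subsystem 1040. Similarly, the data exchange subsystem 1040 may be deployed in the trusted security environment 1010 to ensure the transparency and trustworthiness of its operation.
In some embodiments, as shown in FIG. 1, the data exchange subsystem 1040 may include multiple data channels for different types of data transmission. For example, multimedia data generated in the target application platform 1030 may pass through the corresponding data channel in the data exchange subsystem 1040 and, via a content distribution network 1130 provided by a third party, be communicated to an overseas application 1140 and/or an overseas data center 1150.
As another example, certain internal data generated in the target application platform 1030 may pass through the corresponding data channel and be communicated to the overseas data center 1150 and an overseas development department 1160, for example over a direct optical cable. The specific implementation of the data exchange subsystem 1040 is described in detail below with reference to FIGS. 3A to 3M.
Further, to secure the egress and ingress communication of the target application platform 1030, in some embodiments the data security protection system 1000 may also include an application firewall subsystem 1020. The application firewall subsystem 1020 may, for example, be deployed in the trusted security environment 1010 and may be used to monitor data communication from the target application 1080 to the target application platform 1030, data communication from the target application platform 1030 to the target application 1080, and/or data communication from the target application platform 1030 to a third-party application 1110.
In this way, the data security protection system 1000 can not only use the data exchange subsystem 1040 to ensure that data communication between the target application platform 1030 and overseas parties is secure and compliant, but also use the application firewall subsystem 1020 to ensure that communication between the target application platform 1030 and domestic parties (for example, the target application 1080 or the third-party application 1110) is secure and compliant.
In some embodiments, to ensure that the target application 1080 operates in a compliant and trustworthy manner, the data security protection system 1000 may also include a security sandbox subsystem 1090, managed for example by the TTP, which allows the different types of network communication involved in the application business logic 1100 of the target application 1080 to be protected by the security sandbox subsystem 1090. In this way, the data security protection system 1000 can prevent the target application 1080 from initiating non-compliant data communication, for example through a backdoor program. The detailed implementation of the security sandbox subsystem 1090 is described below with reference to FIGS. 4A to 4E.
Thus, with the data security protection system 1000 of the present disclosure, the TTP can manage and monitor aspects such as code security and data security throughout the life cycle of the target application, from development to operation, thereby ensuring the security of the data associated with the target application and the compliance of its operation.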
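Secure computing subsystem
The secure computing subsystem 1060 is described in detail below with reference to FIG. 2. FIG. 2 shows a schematic block diagram of the secure computing subsystem 1060 according to embodiments of the present disclosure.
As shown in FIG. 2, the secure computing subsystem 1060 may include a secure code environment 2010, which may be provided by the TTP. The working process of the secure computing subsystem 1060 is described below using the submission of new development code 2140 as an example.
As shown in FIG. 2, when a developer has new development code 2140 to deploy, the developer may submit the development code 2140 to the secure code environment 2010 through a synchronization gateway 2150 provided by the TTP. The development code 2140 is then synchronized to a code repository 2160 in the secure code environment 2010.
In some embodiments, when a developer needs to compile using the new development code 2140, the developer may send a build request to an artifact build system 2080 through the synchronization gateway 2150.
Alternatively, when the code repository 2160 receives the new development code 2140, the code repository 2160 may automatically send a code merge event to the artifact build system 2080 to trigger it to start the build process for an artifact (for example, executable code).
After the build process starts, a code pull module 2090 can obtain the code files used for the build from the code repository 2160. In some embodiments, the code files used for the build may be specified by the developer or determined automatically by the artifact build system 2080.
Further, a compile module 2100 can compile the code pulled from the code repository 2160 by the code pull module 2090, for example into intermediate code.
In some embodiments, third-party code is often introduced during code compilation, so the secure computing subsystem 1060 also needs to ensure the security of the introduced third-party code.
As shown in FIG. 2, the secure computing subsystem 1060 may include a third-party independent gateway 2030 for checking and confirming the security of a third-party library 2020 that needs to be introduced. It should be understood that such a third-party library may be, for example, a compiled link library or the source code itself.
A third-party library 2020 that passes the security check can be added to an artifact repository 2040. As shown in FIG. 2, during the build of an artifact, the compile module 2100 can also obtain from the artifact repository 2040 other artifacts on which the current artifact depends, such as previously compiled artifacts or artifacts generated from the third-party library 2020.
Further, the compile module 2100 may combine the code pulled from the code repository 2160 with the dependency artifacts obtained from the artifact repository 2040 and compile them into intermediate code, on which a secure code scanning module 2110 performs a code security check.
It should be understood that the secure code scanning module 2110, managed by the TTP, may run any suitable code scanning process to perform the security check. The scanning rules are unknown to developers, which helps ensure the security of the code compiled into the final artifact.
In some embodiments, an upload module 2120 can perform the corresponding upload according to the result of the secure code scanning module 2110. If the secure code scanning module 2110 determines that the compiled intermediate code is secure, the upload module 2120 can upload the executable file obtained by further compilation to the artifact repository 2040.
Further, if the secure code scanning module 2110 determines that the compiled intermediate code is secure, the upload module 2120 can also upload the signature information of the executable file to an artifact signature management module 2060.
Conversely, if the secure code scanning module 2110 determines that the current intermediate code carries a risk, the upload module 2120 can upload the relevant risk to an issue tracking system 2070, for example to form a risk analysis report. The compiled executable file is then prohibited from being uploaded to the artifact repository 2040.
In some embodiments, the development code 2140 in the code repository 2160 may also be provided in a trusted environment, for example for manual review. If the development code 2140 is found to carry a risk, that result can likewise be reported to the issue tracking system 2070.
In some embodiments, if the secure code scanning module 2110 determines that the current intermediate code carries a risk, the upload module 2120 can also notify a callback module 2130 to mark the corresponding code as risky code in the code repository 2160.
In some embodiments, the issue tracking system 2070, maintained by the TTP, may send the received risk report to the developer or maintainer of the development code 2140 as a reminder that the current development code 2140 cannot pass the security check and therefore cannot be deployed.
In some embodiments, if the development code 2140 passes the security check, it can be compiled into an executable file and added to the artifact repository 2040, to be deployed, for example, via a deployment gateway 2050.
In some embodiments, before deploying an artifact (that is, an executable file) obtained from the artifact repository 2040, the deployment gateway 2050 can verify through the artifact signature management system 2060 whether the artifact's signature is valid. Once the validity of the signature is confirmed, the deployment gateway 2050 can deploy the artifact generated from the development code 2140 to the network.
In some embodiments, the artifact may be an application executed on a client device, in which case the deployment gateway 2050 may publish the generated installation file (for example, an apk file) to the corresponding application store for users to download. Embodiments of the present disclosure can thus ensure that the installation files users can download and install are always published by the secure code environment 2010 via the deployment gateway 2050.
In some embodiments, the artifact may be a service program to be deployed to the target application platform 1030. Specifically, the maintainer of the target application may send the deployment platform a request to deploy a specific artifact to the target application platform 1030. After the request is approved, the target application platform 1030 can obtain the artifact to be deployed from the artifact repository 2040 and authenticate its signature. After the signature is authenticated, the artifact can be deployed to the target application platform 1030, for example as a virtual machine or a container.
Thus, with the secure computing subsystem discussed above, embodiments of the present disclosure can effectively monitor, across stages such as code upload, code writing, code compilation and third-party library references, the process by which code is turned into the application or service program that is actually deployed and used. In this way, embodiments of the present disclosure can effectively avoid the various security vulnerabilities or compliance risks introduced in source code.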
数据交换子系统
应用的运行会涉及在不同国家、地区管辖的应用平台之间进行数据交互。例如,在图1所示的示例中,期望在目标应用平台1030与同一应用在境外运行的目标应用平台之间交互数据,以提供应用的全球数据交互。如前所述,数据交换子系统(DES)1040可以支持目标应用的公共数据和满足规则的其他数据在不同平台之间进行同步,并且确保所交换的数据的安全性和合规性。总体上,DES 1040被配置为检测在不同平台之间的数据是否满足数据交换约束。数据交换约束可以包括为了满足国家或地区的法律和法规等设置的约束,由于企业、组织和/或用户保护的其他方面的要求而需要设置的约束,等等。
例如,在具有特定数据主权保护要求的国家或地区,可能要由TTP进行涉及数据主权保护的检查。因此,在涉及跨平台的数据交换的很多情况下都需要保护数据交换的安全性和合规性。特别是在设置TTP机房后,外界与TTP机房存储的数据交换会受到约束,希望与TTP方交互的数据会经过数据主权保护的检查。在这样的示例中,数据交换约束可以包括与特定国家或地区的数据主权保护要求相关的规则。
这类的交互数据可以划分为两个方面,一个方面包括平台之间的互通类数据,另一方面包括平台的运维人员对平台的访问或操作等运维类数据。互通类数据主要用于在两个平台之间进行同步,确保应用的功能完整性,这类数据需要经过DES系统以进行安全性和合规性的检查。互通类数据例如包括在线业务数据,离线数据等。运维类数据的检查是要确保运维人员在运维控制面上的操作也是合规的。
图3A示出了根据本公开的一些实施例的在其中部署DES 1040的示例部署环境3001。
在图3A中,TTP方3027指的是在特定国家或地区中需要受到TTP监督和约束的环境。TTP方3027可以涉及用于运行、管理、维护目标应用的各种组件,例如包括业务系统3028、运营平台3029、在线存储3030、离线存储3031等。TTP方3027还包括运维平台3032,运维人员会需要访问运维平台3032,以实现对目标应用的访问、管理或维护等。
类似地,非TTP方3020指的是在特定国家或地区之外的一个或多个其他国家或地区所属的环境,其不受到TTP方3027所处国家或地区的数据交换约束。非TTP方3020可以涉及用于运行、管理、维护目标应用的各种组件,例如包括业务系统3020、运营平台3021、在线存储3022、离线存储3023等。非TTP方3020还包括运维平台3024,运维人员会需要访问运维平台3024,以实现对本地应用或应用平台的访问、管理或维护等。
境内用户流量会流通过TTP方3027的一些组件,境外用户流量会流通过非TTP方3020的一些组件。在本文中,“境内用户流量”指的是在该特定国家或地区管辖的应用平台上产生的用户流量,“境外用户流量”指的是在该特定国家或地区之外的一个或多个其他国家或地区管辖的应用平台上产生的用户流量。
在图3A的环境中,互通类数据包括在TTP方与非TTP方之间交换的境内用户流量与境外用户流量。互通类数据会经过DES 1040,以便进行数据安全性和合规性等方面的检查。此外,还可以设置运营网关3026,以用于对运维类数据执行数据安全性和合规性等方面的检查。
如下文将详细讨论的,在DES 1040中,可以根据数据的类型设置不同的数据通道,以在相应的通道中执行对要交换的数据的检查。图3A示意性示出了一些通道,包括目标对象存储(TOS)通道,消息队列(MQ)通道,离线聚合数据通道,日志(LOG)通道,服务调用通道等。
对于数据交换的双方,可以均有各自的DES来实现数据保护,例如用于对流入数据和/或流出数据的保护。
图3B进一步示出了在TTP方的内部数据中心(IDC)和非TTP所处的境外内部数据中心(RoW IDC)中DES 1040的实现。
在图3B中,TTP IDC 3056指的是针对在特定国家或地区运行的目标应用的IDC,其受到TTP的数据保护检测,境外IDC 3059指的是在特定国家或地区之外的一个或多个其他国家或地区中运行目标应用的IDC,其可能受到其他国家或地区的数据保护约束。
如图3B所示,DES 1040A被实现在TTP IDC 3056中,用于检测外部流入和/或内部流出的数据。DES 1040B被实现在境外IDC 3059中,用于检测外部流入和/或内部流出的数据。DES 1040A和DES 1040B可以被认为是DES 1040的具体部署实例。
从TTP IDC 3056角度来看,从外部流入的数据或内部流出的数据可以包括多种类型的数据,下文将举例描述。
如图3B所示,对于TTP IDC 3056,外部流入的数据可以包括用户请求,例如来自特定国家或地区境内的用户通过境内运行的目标应用3058发起的主动请求。如本文中其他部分将描述的,在一些实施例中,用户请求还可以经过移动沙盒和/或TTP IDC 3056中的防火墙网关3057等受到安全性保护。用户请求会到达TTP IDC 3056中的境内应用平台3041进一步处理。在一些示例中,境内应用平台3041可以包括各种服务、供应商网关、存储等组件。此外,如果用户请求是要被传送到TTP IDC 3056以外的数据中心,该用户请求会被传递到DES 1040A进行数据保护。
在一些实施例中,对于TTP IDC 3056,外部流入的数据还可以包括由供应商3055发起的供应商请求,例如请求境内应用平台的特定服务。例如,第三方供应商可能会调用境内应用平台的应用程序接口(API),例如OpenAPI。由于不能确认第三方供应商是否属于境内用户,供应商请求会经由TTP IDC 3056中的第三方网关3040被送到境内应用平台3041中的供应商网关进行检查,以确定是否是境内用户。如果发起请求的供应商是境内用户,那么供应商请求可以被正常响应。如果发起请求的供应商是境外用户,那么供应商请求会经过DES1040A被传送。
在一些实施例中,对于TTP IDC 3056,外部流入的数据还可以包括从境外IDC 3059同步到TTP IDC 3056的数据。例如,如果对于境外流入数据需要进行数据安全审核时,境外流入数据也需要经过DES 1040A的处理。
在一些实施例中,对于TTP IDC 3056,外部流入的数据还可以包括运维人员对TTP IDC 3056的运维操作,例如对TTP IDC 3056的变更。这样的操作可以包括代码类变更,配置类变更,日志维护等。代码类变更例如可以包括新功能的上线、bin文件发布等。代码类变更可以由本国家或地区的应用平台的境内运维人员执行。配置类变更可以包括对目标应用的一些设置的启用或禁用,调度的流量配置等。在一些情况下,对于跨国运营的应用平台,可以由境外平台运维人员执行配置类变更。当然,这取决于不同应用的管理要求。日志维护指的是对TTP IDC 3056中的日志3044进行维护。
在一些实施例中,境内运维人员或境外运维人员可以在网络隔离的条件下对TTP IDC 3056执行运维操作,以进一步确保数据主权保护。如图3B所示,境内运维人员在网络隔离的情况下发起运维操作,运维操作会经由负载均衡器3045进行分配,以分布到TTP IDC 3056中的代码3042处,运维平台3043处或日志3044处。除网络隔离之外,境外运维人员的运维操作会经由运营网关3046来进一步进行安全检查,然后被分布到TTP IDC 3056中的代码3042处,运维平台3043处或日志3044处。
在一些实施例中,对于TTP IDC 3056,内部流出的数据可以包括在应用平台的运行过程中从境内应用平台3041发起的第三方请求,以用于请求第三方服务3054,例如在公共网络中的第三方服务。第三方请求也需要由DES 1040A进行数据保护。
在一些实施例中,对于TTP IDC 3056,内部流出的数据还可以包括从TTP IDC 3056同步到境外IDC 3059的数据。例如,在目标应用运行中,可能会需要将存储在TTP IDC 3056中的用户内容同步到境外IDC 3059。根据数据主权保护的一些规定,这类数据可能是DES1040A需要审核的重点数据。
在一些实施例中,对于TTP IDC 3056,内部流出的数据还可以包括代码同步数据。例如,在一些情况下,出于数据主权保护等方面的检查要求,可能会要求对目标应用或应用平台的代码审核。为了在满足数据主权保护要求的前提下不泄露代码,可能会将代码同步到安全隔离环境3051以供审核。安全隔离环境3051例如可以是不联网的机房,受监控的机房等物理环境,或者具有安全保护的虚拟计算环境,等等。
从境外IDC 3059的角度,其中部署的DES 1040B也会对类似的外部流入数据和内部流出数据进行安全保护。例如,用户通过境外运行的目标应用3058产生的用户请求,在经由负载均衡器3047到达境外应用平台3048(其可以包括各类服务和存储)后,也可以经过DES1040B保护。在运维方面,境外运维人员也可以在网络隔离的情况下经由水晶(crystal)网关3049对境外应用平台3048执行运维操作。这类运维操作也可以经由DES 1040B进行数据保护。
对于在DES 1040A或1040B中进行保护的数据,取决于类型不同,用于执行数据主权保护的方案以及为了实现数据主权保护所需要执行的处理也可能不同。
在本公开的实施例中,在DES 1040(例如,DES 1040A或1040B)中,可以按数据的类型来对数据进行预处理,以将数据的格式统一格式化,从而简化和促进后续关于数据主权保护的检查,加速数据交换过程。
由此,DES 1040中可以按数据类型划分为不同处理部分。例如,按数据来源,DES 1040A中可以包括境内用户数据通道,用于处理特定国家或地区境内用户相关的数据;境外用户数据通道,用于处理境外用户相关的数据;工程技术数据通道,用于处理工程类、运维类数据,诸如代码、参数等各类研发数据、运维数据等。进一步地,取决于数据产生、传输、接收、存储等处理技术,各个通道中的数据还可以被进一步划分。如下文将描述的,按技术划分,不同通道中的数据可以被划分为消息队列(MQ)数据,离线聚合数据,目标对象存储(TOS)数据,服务调用数据中的一项或多项,或者其他类型的数据。
对于在数据主权保护审核中通过的数据,可以从统一格式化格式的数据转换回原始格式的数据,并提供到相应的目的地。根据本公开的方案,由于数据来源不同,不同类型的数据在数据格式、处理技术等方面都各有不同,通过统一格式化的预处理和后处理,可以降低后续在数据主权保护的审核阶段的复杂度。此外,随着数据来源的更新和技术扩展/变化等,可以只需要改变数据的预处理和后处理,而不会对数据交换约束确定阶段的处理进行复杂改动。由此,数据交换架构具有极大的灵活性和可扩展性。
下文将参考附图来详细描述一些具体实施例。
DES的整体架构和数据流
图3C示出了根据本公开的一些实施例的DES 1040的示例架构的框图。在图3C的示例中,DES 1040被示出为对目标应用在境内应用平台3041与外部应用平台(统称为境外应用平台3048)之间同步数据,并执行数据交换约束的确定。
如图3C所示,DES 1040可以包括DES适配器3061,DES中心和DES适配器3070。DES中心可以包括针对不同类型数据通道的DES中心,例如针对境内用户数据的DES中心3065A,针对境外用户数据的DES中心3065B,以及针对工程技术数据的DES中心3065C等。DES中心3065A、3065B和3065C具有不同同步能力。在下文中,有时为了便于描述,DES中心3065A、3065B和3065C可以被统称为DES中心3065。
DES适配器3061与境内应用平台3041相连,用于从境内应用平台3041接收要同步并且要经由DES 1040检测的数据,以及将从境外应用平台3048接收到并且经过DES 1040检测的数据发送给境内应用平台3041。DES适配器3070与境外应用平台3048相连,用于将从境内应用平台3041接收到并且经过DES 1040检测的数据发送给境外应用平台3048,并且从境外应用平台3048接收要同步并且要经由DES 1040检测的数据。DES适配器3061和DES适配器3070均与DES中心3065互连,以向DES中心3065传送数据。
各个DES中心3065被配置为利用数据交换约束来检测数据,以确保在两个应用平台之间交换的数据的安全性和合规性。通常,满足数据交换约束的数据会通过DES 1040被传递到相应的目的地,而不满足数据交换约束的数据可能会被DES 1040驳回。
DES适配器3061和3070可以被配置为对要传入DES中心3065的数据执行预处理和后处理,以使DES中心3065在各个数据类型对应的统一格式化数据基础上进行关于数据交换约束是否被满足的确定。
在一些实施例中,DES 1040中的DES适配器3061和DES中心3065可以与境内应用平台3041一起被实现在TTP IDC 3056中,DES适配器3070可以与境外应用平台3048一起被实现在境外IDC 3059中。
在一些实施例中,可以隔离DES 1040中的不同组件,以进一步确保更有效的数据隔离。这样的数据隔离可以通过将不同组件部署在不同数据中心来实现。在一些实施例中,可以通过应用虚拟私有数据中心(VPC)技术来实现数据隔离。例如,如图3C所示,DES适配器3061可以被实现在VPC1中,各个DES中心可以被实现在VPC2中,DES适配器3070可以被实现在VPC3中。DES中心3065中的数据安全性和合规性的确定可以由TTP执行。在数据隔离的情况下,VPC1和VPC3不具有直接通信连接,但VPC1和VPC3分别与VPC2具有直接通信连接,可以彼此通信数据/信息。通过VPC技术所带来的数据隔离,部署在VPC2的DES中心3065可以是TTP信任的区域(称为TTP受信区域)。
在一些实施例中,DES适配器3061可以包括DES入口3062,其可以实现控制面的处理,例如由运维人员申请建立和管理数据通道,注册规则等,并且可以由TTP查看通道中的数据。DES适配器3061还可以包括DES代理(proxy)3063,其可以实现数据面的处理,例如数据验证、数据过滤、数据转换、数据采样、日志检测,等等。类似地,在一些实施例中,DES适配器3070可以包括控制面的DES入口3072和数据面的DES代理3073。
在一些实施例中,对于境内用户数据通道,DES中心3065A可以包括DES注册中心,用于注册数据交换约束、配置数据等等。DES中心3065A还可以包括进一步细分的通道,包括针对服务调用数据的服务调用通道,针对MQ数据的MQ通道,针对离线聚合数据的HDFS通道(其中,HDFS称为Hadoop分布式文件系统)以及针对TOS数据的TOS通道。离线聚合数据例如包括高度并行集成虚拟环境(HIVE)类型的数据。
服务调用数据例如可以包括利用各种网络协议或调用协议,例如HTTP协议或RPC协议来进行远程服务调用的数据。MQ数据可以包括支持MQ协议以及类似协议的数据,例如包括各类数据库(例如,MySQL,Redis数据库)中存储的数据。离线聚合数据可以包括基于HDFS技术的文件系统中的数据,以及基于其他技术的文件系统中的数据。TOS数据包括对象文件,例如视频、音频、图像、文档、以及其他媒体文件。
在一些实施例中,虽然图3C中未示出,对于境外用户数据通道的DES中心3065B和对于工程技术数据通道的DES中心3065C,也可以包括与DES中心3065A类似的组件。
图3D示出了根据本公开的一些实施例的数据交换过程3004的流程图。过程3004可以被实现在DES 1040中。
如图3D所示,在框3301,DES 1040获取目标应用要在第一平台(例如,境内应用平台3041)与第二平台(例如,境外应用平台3048)之间交换的原始数据。取决于交换的方向,原始数据可以来自第一平台,并可以由DES 1040中的DES适配器3061接收到。或者,原始数据可以来自第二平台,并可以由DES 1040中的DES适配器3070接收到。
在框3302,DES 1040基于原始数据的类型来处理原始数据,以获得该类型对应的统一格式化数据。原始数据的处理(此处的处理也可称为预处理)可以根据原始数据的类型来确定。原始数据的类型例如可以包括MQ数据、离线聚合数据、TOS数据或服务调用数据等。进一步地,在一些情况下,原始数据的处理也可以按数据来源的不同来确定。例如,根据数据来源,原始数据可以被划分为境内用户数据,境外用户数据或工程技术数据。不同类型的数据对应的格式不同,并且可以应用不同方式来产生对应的统一格式化数据。
在一些实施例中,由于数据源所使用的技术不同,同一类型的数据可能在不同格式下被提供,这增加了对技术处理的要求。因此,可以指定一个统一格式。在预处理阶段,可以通过格式转换将原始数据的格式转换为该类型下的指定格式,以获得统一格式化数据。
例如,对于MQ数据,可以将不同格式的MQ数据进行解析,以便分析由不同格式封装的消息中的内容。对于离线聚合数据和TOS数据,可以响应于从不同格式下的文件系统或数据系统调用这些数据的不同请求,转换为通过统一API来实现的文件调用请求。对于服务调用,可以将不同协议下生成的服务调用请求转换为统一协议中的服务调用请求。
对于不同类型的数据的具体预处理方式,下文将更详细描述。
在框3303,DES 1040从统一格式化数据确定对数据交换约束的满足。例如,DES 1040中的DES中心3065,特别是对应数据类型的DES中心3065可以执行对数据交换约束的满足与否的检查。经过统一格式化的预处理,DES中心3065不需要应用各种不同技术来解析原始数据,从而更方便地利用规则来执行数据安全性和合规性的检查。
在框3304,如果确定统一格式化数据满足数据交换约束,DES 1040将统一格式化数据转换为原始数据。在满足数据交换约束的情况下,数据被允许在平台之间同步。为了确保数据正确同步,DES 1040会进一步处理中间生成的统一格式化数据(即,执行后处理阶段),以将统一格式化数据转换为原始数据,其具有原始的格式。
在框3305,DES 1040执行原始数据在第一平台与第二平台之间的交换。由此,可以实现在满足安全性和合规性情况下的数据交换。
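框3301至框3305所描述的流程可以用下面的极简草图来示意。这只是在若干假设下的示例,并非本公开的实际实现:其中假设MQ消息为UTF-8编码的JSON,统一格式为一个普通字典,数据交换约束被简化为"载荷中不得出现被禁止的字段";函数名 `preprocess`、`satisfies_constraints`、`postprocess`、`exchange` 均为示意性命名。

```python
import json

def preprocess(raw: bytes, data_type: str) -> dict:
    # 框3302:按类型转换为统一格式;此处假设MQ消息为UTF-8编码的JSON
    if data_type != "mq":
        raise ValueError(f"unsupported data type: {data_type}")
    return {"type": data_type, "payload": json.loads(raw.decode("utf-8"))}

def satisfies_constraints(unified: dict, forbidden_fields: set) -> bool:
    # 框3303:假设的约束,载荷中不得出现被禁止的字段
    return not (forbidden_fields & set(unified["payload"]))

def postprocess(unified: dict) -> bytes:
    # 框3304:把统一格式化数据转换回原始格式
    return json.dumps(unified["payload"], sort_keys=True).encode("utf-8")

def exchange(raw: bytes, data_type: str, forbidden_fields: set, send) -> bool:
    unified = preprocess(raw, data_type)
    if not satisfies_constraints(unified, forbidden_fields):
        return False  # 不满足约束的数据被驳回
    send(postprocess(unified))  # 框3305:执行交换
    return True
```

在这个草图中,约束检查只面对统一格式,因此新增一种原始格式时只需要改动 `preprocess` 和 `postprocess`。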
在一些实施例中,如上文简单提及的,可以在不同平台之间创建与不同类型的原始数据分别对应的多个数据通道,不同类型的原始数据将会被传递到对应的数据通道中进行处理。每个数据通道可以包括适合处理该类型的原始数据的预处理组件、后处理组件和关于数据交换约束的确认组件。附加地或备选地,每个数据通道可以被注册有要被应用到该特定类型的原始数据的数据交换约束。通过这样的方式,可以实现对不同类型的数据的预处理、数据交换约束的确认和后处理方面的分离。
与不同类型的数据对应的数据通道可以被灵活地创建、更新和删除。这样,如果数据的预处理和后处理方式发生变化,或者针对数据的特定类型的数据交换约束需要更新,都可以在相应数据通道中执行,而不会影响到其他数据通道。此外,根据业务需要,如果要在第一平台与第二平台之间交换的新类型的原始数据并且该新类型的数据也要执行关于数据主权保护的检查,那么可以灵活地在第一平台与第二平台之间创建新的数据通道用于处理新类型的原始数据。
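数据通道彼此独立地创建、更新和删除这一点,可以用一个按数据类型索引的注册表来示意。以下草图中的 `DataChannel`、`ChannelRegistry` 及其方法均为假设的命名,原文并未给出具体的数据结构。

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class DataChannel:
    data_type: str                              # 例如 "mq"、"tos"
    preprocess: Callable[[bytes], dict]         # 预处理组件
    postprocess: Callable[[dict], bytes]        # 后处理组件
    constraints: List[Callable[[dict], bool]] = field(default_factory=list)

    def allows(self, unified: dict) -> bool:
        # 所有注册到该通道的约束都满足时才允许交换
        return all(rule(unified) for rule in self.constraints)

class ChannelRegistry:
    def __init__(self) -> None:
        self._channels: Dict[str, DataChannel] = {}

    def create(self, channel: DataChannel) -> None:
        if channel.data_type in self._channels:
            raise KeyError(f"channel already exists: {channel.data_type}")
        self._channels[channel.data_type] = channel

    def update(self, channel: DataChannel) -> None:
        if channel.data_type not in self._channels:
            raise KeyError(f"no such channel: {channel.data_type}")
        self._channels[channel.data_type] = channel

    def delete(self, data_type: str) -> None:
        del self._channels[data_type]

    def get(self, data_type: str) -> DataChannel:
        return self._channels[data_type]
```

由于每个通道自带预处理、后处理和约束,更新某一类型的通道不会触及其他类型的通道。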
图3E示出了根据本公开的一些实施例的在DES 1040处实现的各类数据处理的示例数据流3005的流程图。数据流3005涉及控制面的数据流和数据面的数据流。
在控制面,可以由运维人员在DES 1040中配置一个或多个类型的数据的通道,并可以实现对通道的更新和维护等。如图3E所示,境内运维人员可以经由DES入口3062请求配置特定数据类型和用于处理特定数据类型的通道,并将指示特定数据类型的数据目录3081和对特定数据类型的数据定义3082注册到DES注册中心3066。数据定义3082可以指定在DES 1040中对不同类型的数据进行处理的通道信息,并且可以包括针对相应类型的数据的预处理方案、后处理方案等。
类似地,境外运维人员也可以经由DES入口3072请求配置特定数据类型和用于处理特定数据类型的通道。境外运维人员也可以将指示特定数据类型的数据目录3084和对特定数据类型的数据定义3085注册到DES注册中心3066。数据定义3085可以指定在DES 1040中对不同类型的数据进行处理的通道信息,并且可以包括针对相应类型的数据的预处理方案、后处理方案等。
在数据面,不同类型的数据在DES 1040中会经过各自的通道。如图3E所示,对于服务调用数据,在TTP IDC侧的客户端或服务器3086与境外IDC侧的客户端或服务器3090之间交换服务调用请求。为了使服务调用请求满足数据主权保护要求,服务调用请求在DES 1040中的服务调用通道被处理。
在图3E的示例中,服务调用通道可以至少包括DES代理3063中的预处理模块3087、DES中心3065中的HTTP代理3088,以及DES代理3073中的路由模块3089。来自TTP IDC侧的客户端或服务器3086的服务调用请求被传送到预处理模块3087。预处理模块3087利用来自数据定义3082中所规定的数据预处理方案来处理服务调用请求,并将统一格式化后的服务调用请求发送给HTTP代理3088。
在这个示例中,假设服务调用请求被统一格式化为符合统一协议,即HTTP协议的请求。因此,HTTP代理3088可以在确定统一格式化后的服务调用请求满足数据交换约束后,将统一格式化后的服务调用请求通过路由模块3089提供到另一侧的客户端或服务器3090。在被提供到客户端或服务器3090之前,统一格式化后的服务调用请求被转换回到符合原始协议的服务调用请求。
对于MQ数据,这个类型的原始数据在DES 1040中的MQ通道被处理。在图3E的示例中,MQ通道可以至少包括DES代理3063中的预处理模块3092、DES中心3065中的MQ传送器3094,以及DES代理3073中的路由模块3097。
MQ类型的原始数据3091被传送到预处理模块3092。预处理模块3092利用来自数据定义3082中所规定的数据预处理方案来处理原始数据3091,得到统一格式化数据3093。统一格式化数据3093由MQ传送器3094提取,例如经由第三方软件开发工具包(SDK)提取。在经过数据交换约束的检查后,由SDK将满足规则的统一格式化数据3096推送到境外IDC。不满足数据交换约束的统一格式化数据3095被驳回。路由模块3097将满足规则的统一格式化数据3096路由到对应的目的地,在被传输到目的地之前统一格式化数据3096被转换回对应的原始数据3098。
对于离线聚合数据和TOS数据,原始数据分别会在DES 1040中的HDFS通道和TOS通道被处理。为简化,图3E示出了一个通道的示例,但可以理解,HDFS通道和TOS通道可以包括图示的组件。在图3E的示例中,HDFS通道或TOS通道可以至少包括DES代理3063中的预处理模块3100、DES中心3065中的文件传送器3103,以及DES代理3073中的路由模块3105。
由于离线聚合数据类型或TOS类型的数据被存储在文件系统或其他存储系统中,预处理模块3100可以向文件传送管理器3102发起调用文件传送API的请求,以使离线聚合数据类型或TOS类型的原始数据3099被传送到预处理模块3100。预处理模块3100可以利用来自数据定义3082中所规定的数据预处理方案来处理原始数据3099,得到统一格式化数据3101。
与MQ类型的数据处理类似,统一格式化数据3101由文件传送器3103提取,例如经由SDK提取。在经过数据交换约束的检查后,由SDK将满足规则的统一格式化数据3104推送到境外IDC。不满足数据交换约束的统一格式化数据会被驳回,无法被传送到境外IDC。路由模块3105将满足规则的统一格式化数据3104路由到对应的目的地,在被传输到目的地之前统一格式化数据3104被转换回对应的原始数据3106。
应当理解,图3E仅示出了在DES 1040中对从TTP IDC到境外IDC的流出数据的处理。对于相反方向的数据流,在DES 1040中也可以通过类似的流程处理,并且DES 1040也可以保留对应的组件用于支持相应的处理,特别是在DES适配器中的组件。
下文将针对DES 1040中不同类型的数据的一些示例实现进行详细讨论。
针对MQ的数据交换的示例实现
图3F示出了根据本公开的一些实施例的涉及MQ通道的数据交换架构3006的示意框图。数据交换架构3006可以被实现在DES 1040中,用于针对MQ类型的数据执行数据安全保护。在图3F的示例中,示出了从TTP IDC到境外IDC方向的数据交换。
如图3F所示,TTP IDC中的源数据库3110产生要传送的MQ数据的实体。MQ数据可以包括改变数据或者商业定制事件等消息,不同消息可以具有不同的格式。源数据库3110产生的MQ数据被放入源消息队列3112中。
在图3F的示例中,在DES适配器3061中除DES入口3062外,还包括DES前适配器3120。DES前适配器3120可以被实现为DES代理3063的一部分,以用于对从TTP IDC到境外IDC方向的MQ数据进行预处理。DES前适配器3120可以被配置为将不同格式的MQ数据处理为具有统一格式的统一格式化MQ数据,并将统一格式化MQ数据提供给MQ传送器3094,以执行关于数据交换约束是否满足的确定。
MQ数据(或消息)也可以包括由不同协议生成的数据,每个协议下的数据具有定制格式,因此需要不同的预处理。如图3F所示,DES前适配器3120可以包括解析器3122,其被配置为对不同类型的原始MQ数据进行解析,以将不同类型的原始MQ数据转换为统一格式的统一格式化MQ数据。如图3F所示,DES前适配器3120可以包括MySQL解析器,用于解析通过MySQL协议生成的数据,例如变化数据捕获(CDC)数据;Redis解析器,用于解析通过Redis协议生成的数据,例如CDC数据;文档解析器,用于解析文档数据库中的数据,特别是CDC数据;图解析器,用于解析图(graph)数据库的数据,特别是CDC数据;MQ解析器,用于解析通过消息队列发送的不同类型的业务事件数据,等等。可以理解,解析器3122是可灵活缩放的,其中可以设置更多、更少或其他的解析器,用于解析相应类型的MQ数据。
解析后得到的统一格式化MQ数据也可以是消息队列的形式,可以被放入统一格式化消息的队列3124中。在TTP IDC的VPC2中,负责MQ数据的MQ传送器3094可以通过SDK从统一格式化消息的队列3124提取解析后的统一格式化MQ数据用于进行数据安全性和合规性检查。不满足数据交换约束的统一格式化MQ数据被MQ传送器3094驳回,并被记录在驳回日志3126中。满足数据交换约束的统一格式化MQ数据经由SDK被推送到DES适配器3070中的DES后适配器3130。
DES后适配器3130可以被实现为DES代理3073的一部分,以用于对从TTP IDC到境外IDC方向的统一格式化MQ数据进行后处理,以将数据传送到目的地。满足数据交换约束的统一格式化MQ数据经由SDK被推送到DES后适配器3130。
DES后适配器3130可以包括数据回放器3132,用于对统一格式化MQ数据执行后处理。具体地,DES后适配器3130可以被配置为将统一格式化MQ数据转换为原始MQ数据。因此,DES后适配器3130可以包括与不同类型的MQ数据相对应的回放器(replayer),用于执行从统一格式到各自的定制格式的转换。如图3F所示,DES后适配器3130可以包括MySQL回放器,用于将统一格式化MQ数据转换为符合MySQL协议的MQ数据;Redis回放器,用于将统一格式化MQ数据转换为符合Redis协议的MQ数据;文档回放器,用于将统一格式化MQ数据转换为文档形式的原始数据;图回放器,用于将统一格式化MQ数据转换为图形式的原始数据;MQ回放器,用于将统一格式化数据转换为符合MQ协议的原始数据,等等。
转换后的原始MQ数据被放入DES后适配器3130中的统一格式化消息的队列3134,并从中可以被同步到目标消息队列3135。目标消息队列3135用于存放从源消息队列3112经由DES 1040间接同步过来的MQ数据。目标数据库3136可以从目标消息队列3135获得期望的MQ数据。
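解析器与回放器互为逆操作这一点,可以用下面的草图来示意。其中的事件字段(`table`、`type`、`row`、`key`、`cmd`、`value`)以及统一格式的字段(`source`、`entity`、`op`、`data`)都是为说明而假设的,并非MySQL或Redis的真实CDC格式,也不是原文给出的统一格式。

```python
def parse_mysql_cdc(event: dict) -> dict:
    # 假设的MySQL CDC事件 -> 统一格式
    return {"source": "mysql", "entity": event["table"],
            "op": event["type"].lower(), "data": event["row"]}

def parse_redis_cdc(event: dict) -> dict:
    # 假设的Redis CDC事件 -> 统一格式
    return {"source": "redis", "entity": event["key"],
            "op": event["cmd"].lower(), "data": {"value": event["value"]}}

def replay_mysql(msg: dict) -> dict:
    # 统一格式 -> 假设的MySQL CDC事件
    return {"table": msg["entity"], "type": msg["op"].upper(), "row": msg["data"]}

def replay_redis(msg: dict) -> dict:
    # 统一格式 -> 假设的Redis CDC事件
    return {"key": msg["entity"], "cmd": msg["op"].upper(),
            "value": msg["data"]["value"]}

PARSERS = {"mysql": parse_mysql_cdc, "redis": parse_redis_cdc}
REPLAYERS = {"mysql": replay_mysql, "redis": replay_redis}

def to_unified(kind: str, event: dict) -> dict:
    return PARSERS[kind](event)

def to_original(msg: dict) -> dict:
    # 回放器根据统一格式中记录的来源来选择
    return REPLAYERS[msg["source"]](msg)
```

增加新的数据源时,只需向 `PARSERS` 和 `REPLAYERS` 各登记一项,中间的约束检查逻辑无需变化。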
图3F中仅示出了从TTP IDC到境外IDC方向的数据交换所涉及的组件。对于从境外IDC到TTP IDC方向的数据交换,DES 1040中可以包括类似的组件用于处理这个方向的数据交换,例如DES适配器3070可以包括具有与DES前适配器3120类似功能的DES前适配器,并且DES适配器3061可以包括具有与DES后适配器3130类似功能的DES后适配器。为简化目的,这个方向的处理不再详细展开。
可以理解,图3F中示出的在DES中用于处理MQ数据交换的组件仅是示例。在其他示例中,取决于需要,不同功能模块还可以按其他方式被细分、合并等,并且还可以包括更多、更少或不同的功能模块。
针对离线聚合数据的数据交换的示例实现
图3G示出了根据本公开的一些实施例的涉及HDFS通道的数据交换架构3500的示意框图。数据交换架构3500可以被实现在DES 1040中,用于针对离线聚合数据执行数据安全保护。在图3G的示例中,示出了在TTP IDC侧的HDFS 3502与境外IDC侧的HDFS 3504之间的离线聚合数据交换。在HDFS 3502与HDFS 3504中的一些离线聚合数据可能需要彼此同步。
如图3G所示,在数据交换架构3500中,TTP IDC侧的数据传送检测器3510负责检测HDFS 3502中是否存储了需要被传送到另一侧的HDFS 3504的离线聚合数据。在发现要传送的离线聚合数据的情况下,数据传送递交器3520可以向文件传送器3550递交数据传送的请求。在被递交到文件传送器之前,数据预处理模块3530被配置为对数据执行预处理,以将离线聚合数据处理为统一格式化数据。
在文件传送器3550中,数据传送服务器3556被配置为利用数据交换约束来控制数据传送服务。如果数据传送服务器3556确定来自HDFS 3502的预处理后的统一格式化数据符合数据交换约束,那么可以调用传送工作3558,以将统一格式化数据通过传送工作3558下的传送任务3562传送到境外IDC。在一些实施例中,传送工作3558还可以可选地包括数据验证任务3560,其可以被配置为根据需要执行数据验证。统一格式化数据经过HDFS网关3564,并可以在被执行后处理后,得到原始的离线聚合数据,并被存入HDFS 3504。
类似地,在数据交换架构3500中,境外IDC侧的数据传送检测器3570负责检测HDFS 3504中是否存储了需要被传送到TTP IDC侧的HDFS 3502的离线聚合数据。在发现要传送的离线聚合数据的情况下,数据传送递交器3572可以向文件传送器3550递交数据传送的请求。在被递交到文件传送器之前,数据预处理模块3570被配置为对数据执行预处理,以将离线聚合数据处理为统一格式化数据。
在文件传送器3550中,如果数据传送服务器3556确定来自HDFS 3504的预处理后的统一格式化数据符合数据交换约束,那么可以调用传送工作3554,以将统一格式化数据通过传送工作3554下的传送任务3552传送到TTP IDC。统一格式化数据经过后处理后,得到原始离线聚合数据,并被存入HDFS 3502。
可以理解,图3G中示出的在DES中用于处理离线聚合数据交换的组件仅是示例。在其他示例中,取决于需要,不同功能模块还可以按其他方式被细分、合并等,并且还可以包括更多、更少或不同的功能模块。
针对对象存储的数据交换的示例实现
总体而言,TOS通道可以确定对象文件是否满足数据交换约束、以及在约束满足的情况下将对象文件从源IDC(例如,TTP IDC或境外IDC)复制到目的地IDC(例如,境外IDC或TTP IDC)。对象文件例如是视频、音频、图像、文档、或者其他媒体文件。
在一些实施例中,可以通过API从对象存储复制对象文件,执行数据交换约束的确定,并且利用API将对象文件推送到目的地端的对象存储。在对象文件的数据交换中,通过与对象文件对应的复制请求来确定对数据交换约束的满足。下文将参考图3H至图3J来描述TOS通道的细节。
图3H示出了根据本公开的一些实施例的数据从TTP IDC复制到境外IDC的目标对象存储(TOS)通道3600的示意图。在该示例中,要交换的数据是对象文件,其被存储在TTP IDC中的对象存储3606,并且期望要被交换到境外IDC的对象存储3607。
在图3H中,TTP IDC中的API 3602被配置为将复制请求推送给工作节点3605,并且从工作节点3605接收从另一侧的境外IDC交换过来的复制结果。如图所示,在数据流开始3601时,针对要交换的对象文件的复制请求由API(也可称为DES-TOS API)3602传送给工作节点3605。该复制请求可以指示要交换的对象文件相关的信息,例如对象文件的格式(视频、音频、文本等)、对象文件的标识符、以及其他文件元数据等。该复制请求具有统一格式。
受信区域VPC2内的工作节点3605被配置为响应于针对对象文件的复制请求,执行关于数据交换约束的确定。具体地,工作节点3605可以从统一格式化的复制请求中确定要交换的对象文件是否满足数据交换约束。
在一些实施例中,在TTP IDC侧,可以在初始阶段或者后续需要的时候发起数据交换约束的注册。在约束注册开始3622时,可以通过TTP IDC中的DES入口3620,向TTP受信任区域中的DES注册中心3624注册要使用的数据交换约束。数据交换约束的注册可以通过调用API 3602来实现。工作节点3605可以通过DES注册中心3624来访问当前要使用的数据交换约束。
在一些实施例中,数据交换约束可以指示允许交换的对象文件的白名单或者不允许交换的对象文件的黑名单,在每个名单中可以按对象文件的格式、标识符等来标识允许或不允许交换的文件对象。
在执行数据交换约束时,工作节点3605允许满足数据交换约束的复制请求的执行。如果复制请求允许被执行,工作节点3605访问TTP IDC中的对象存储3606以将对象文件复制到境外IDC中的对象存储3607。对于非法请求(即不满足数据交换约束的复制请求),它们将被拒绝,从而无法被执行。工作节点3605可以将复制的对象文件经由境外IDC中的API 3610来写入对象存储3607。这样,数据流结束3611。
图3I示出了根据本公开的一些实施例的数据从境外IDC复制到TTP IDC的TOS通道3650的示意图。在该示例中,要交换的对象文件被存储在境外IDC中的对象存储3607,并且期望要被交换到TTP IDC的对象存储3606。
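工作节点依据复制请求中的元数据来判断是否允许复制,这一过程可以用下面的草图来示意。这里假设约束由"允许的文件格式白名单"和"禁止的对象标识符黑名单"组成,对象存储用字典来模拟;`CopyRequest`、`ExchangeConstraint`、`handle_copy` 都是示意性命名,并非原文定义的接口。

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CopyRequest:
    object_id: str      # 对象文件的标识符
    file_format: str    # 例如 "video"、"audio"、"text"

@dataclass(frozen=True)
class ExchangeConstraint:
    allowed_formats: frozenset = frozenset()  # 假设的白名单
    blocked_ids: frozenset = frozenset()      # 假设的黑名单

    def permits(self, req: CopyRequest) -> bool:
        return (req.file_format in self.allowed_formats
                and req.object_id not in self.blocked_ids)

def handle_copy(req: CopyRequest, constraint: ExchangeConstraint,
                source: dict, destination: dict) -> bool:
    # 只检查统一格式的复制请求;非法请求被拒绝,不会读取源对象存储
    if not constraint.permits(req):
        return False
    destination[req.object_id] = source[req.object_id]
    return True
```

在这一草图中,被拒绝的请求不会触发对源对象存储的访问,这与"非法请求无法被执行"的描述一致。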
在图3I中,境外IDC中的API 3610被配置为将复制请求推送给工作节点3605,并且从工作节点3605接收从另一侧的TTP IDC交换过来的复制结果。如图3I所示,在数据流开始3651时,针对要交换的对象文件的复制请求由API 3610传送给工作节点3605。该复制请求可以指示要交换的对象文件相关的信息,例如对象文件的格式(视频、音频、文本等)、对象文件的标识符、以及其他文件元数据等。该复制请求具有统一格式。受信区域VPC2内的工作节点3605可以从统一格式化的复制请求中确定要交换的对象文件是否满足数据交换约束。
在一些实施例中,在境外IDC侧,可以在初始阶段或者后续需要的时候发起数据交换约束的注册。在约束注册开始3632时,可以通过境外IDC中的DES入口3630,向TTP受信任区域中的DES注册中心3624注册要使用的数据交换约束。数据交换约束的注册可以通过调用API 3610来实现。工作节点3605可以通过DES注册中心3624来访问当前要使用的数据交换约束。
在执行数据交换约束时,工作节点3605允许满足数据交换约束的复制请求的执行。如果复制请求允许被执行,工作节点3605访问境外IDC中的对象存储3607以将对象文件复制到TTP IDC中的对象存储3606。对于非法请求(即不满足数据交换约束的复制请求),它们将被拒绝,从而无法被执行。工作节点3605可以将复制的对象文件经由TTP IDC中的API 3602来写入对象存储3606。这样,数据流结束3652。
可以理解,图3H和图3I中示出的在DES中用于处理TOS数据交换的组件仅是示例。在其他示例中,取决于需要,不同功能模块还可以按其他方式被细分、合并等,并且还可以包括更多、更少或不同的功能模块。
图3J示出了根据本公开的一些实施例的在TOS通道中的消息序列3012。图3J中的消息序列3012涉及TTP 3701、运维人员3702、平台工作人员3703、DES入口3704、API 3705、工作节点3605和对象存储3708。
取决于数据交换的方向,图3J中的DES入口3704、API 3705和对象存储3708可以是图3H和图3I中任一中的对应组件。例如,在图3H所示的从TTP IDC复制到境外IDC的TOS通道3600中,DES入口3704包括图3H所示的DES入口3620,API 3705包括图3H中的API 3602,对象存储3708包括图3H中的对象存储3606。在从境外IDC复制到TTP IDC的TOS通道3650中,DES入口3704包括图3I所示的DES入口3630,API 3705包括图3I中的API 3610,对象存储3708包括图3I中的对象存储3607。
在消息序列3012中,运维人员3702向DES入口3704注册3711数据交换约束,其可以约束对象文件在不同IDC的对象存储3606和3607之间的复制。在完成注册后,DES入口3704可以向运维人员发送3714响应。DES入口3704向API 3705注册3712关于数据交换约束的容器信息,并且在注册完成后API 3705可以向DES入口3704发送3713响应。经由DES入口3704注册的规则可以被高速缓存3715到API 3705,并且也可以被高速缓存3716到工作节点3605。
平台工作人员3703可以向API 3705发起3717对对象文件的复制请求。API 3705可以执行认证3718。工作节点3605可以从API 3705拉取3719复制请求,并且对要复制的对象文件执行3720数据交换约束的确定。如果允许复制对象文件,工作节点3605执行3721文件复制,以从对象存储3708复制对应的对象文件。无论数据交换约束的确定结果如何,工作节点3605都会向API 3705返回3722反馈。在允许复制对象文件的情况下,反馈包括所复制的对象文件。在不允许复制对象文件的情况下,反馈用于指示复制请求被拒绝。
在一些实施例中,平台工作人员3703可以回调3723API 3705,从API 3705可以向平台工作人员3703返回3724复制请求ID。在一些实施例中,TTP 3701可以通过DES入口3704来查看3725历史对象文件复制的情况,以确认在过去一段时间内对象文件的交换是否符合数据交换约束的要求。DES入口3704可以返回3726所要查看的结果。
针对服务调用的数据交换保护的示例实现
图3K示出了根据本公开的一些实施例的涉及服务调用通道的数据交换架构3800的示意框图。数据交换架构3800可以被实现在DES 1040中,用于针对服务调用类型的数据执行数据安全保护。在图3K的示例中,示出了在TTP IDC侧的目标平台服务3802与境外IDC侧的境外(非TTP)平台服务3804之间的服务调用数据交换。例如,目标平台服务3802上的服务可能需要调用境外平台服务3804上的服务,反之,境外平台服务3804的服务也可能需要调用目标平台服务3802上的服务。
不同服务平台可能会应用多种不同的服务调用协议,例如HTTP协议或Thrift RPC协议。在本公开的一些实施例中,希望在VPC受信区域中执行数据主权保护时可以处理统一格式化数据,例如HTTP协议数据。
在图3K中,在控制面,非TTP控制面用于通道注册、通道架构更新、检测;TTP/TTP控制面用于通道请求批准、通道禁止、通道检测等。在数据面,HTTP负载均衡器3810是来自TTP Cloud的L7均衡产品,其是确保所有DES-RPC通道流量通过VPC受信区域的关键组件。HTTP通道是DES-RPC通道中支持HTTP协议的通道。Thrift RPC通道是DES-RPC通道中支持Thrift RPC协议的通道。在被发送到TTP的HTTP负载均衡器之前,Thrift RPC通道将被包裹在HTTP通道中。
在通道注册阶段,DES-RPC通道利用通道信息和数据定义来声明。通道信息可以包括通道的类型,例如Thrift RPC或HTTP。通道信息还可以包括RPC调用元组。调用元组可以包括src dc、src服务、dst dc、dst服务、rpc方法/http路径。
数据定义可以取决于数据的流动方向。对于从非TTP到TTP的数据流动,将使用具有合规注释的Thrift IDL来声明响应。对于从TTP到非TTP的数据流动,将使用带有合规注释的Thrift IDL来声明请求。在一些实施例中,在DES-RPC通道通过合规性注册时,DES-RPC通道才是可用的。
可以理解,图3K中示出的在DES中用于处理服务调用数据交换的组件仅是示例。在其他示例中,取决于需要,不同功能模块还可以按其他方式被细分、合并等,并且还可以包括更多、更少或不同的功能模块。
图3L示出了根据本公开的一些实施例的在图3K所示的服务调用通道中从非TTP到TTP的数据交换示例。如图3L所示,由境外区域的服务A 3901发起的调用将由HTTP代理3902或Thrift代理3903转发到TTP的HTTP负载均衡器3905。服务A 3901可以是图3K所示的境外平台服务的一个示例。对于HTTP请求,调用将由HTTP代理3902转发到HTTP负载均衡器3905。对于Thrift请求,调用将由Thrift代理3903转发到HTTP负载均衡器3905。
在一些实施例中,到VPC受信区域中的HTTP负载均衡器3905的服务发现可以通过DNS来实现,而从HTTP负载均衡器3905到对应IDC流量代理的服务发现可以用其他定制/通用的服务发现机制来实现。
境外IDC中对应服务代理,比如说HTTP代理3902或者Thrift代理3903到VPC受信区域HTTP负载均衡器3905的服务发现建议通过DNS来实现,对应请求转发到TTP IDC区域的代理的服务发现则建议使用定制/通用的服务发现。
在VPC受信区域中,TTP的HTTP负载均衡器3905将请求分别转发到TTP的HTTP代理3907和Thrift代理3908,然后HTTP代理3907和Thrift代理3908分别将请求转发到作为目标服务的服务B 3909和服务C 3910。对于Thrift rpc调用,Thrift代理3908会在发送请求之前从所生成的新的HTTP请求中恢复原始Thrift请求。
TTP的HTTP代理3907和Thrift代理3908将在向TTP的HTTP负载均衡器3905发送响应之前对响应进行检查。对于未通过合规性检查的响应,将返回错误。此外,对于Thrift rpc调用,Thrift响应将被HTTP包裹以生成新的HTTP响应。新的HTTP响应的主体为Thrift二进制文件。
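"用HTTP包裹Thrift消息,主体为Thrift二进制"的做法可以用下面的草图来示意。这里不依赖任何真实的HTTP或Thrift库,只用一个数据类来模拟HTTP请求;路径 `/des-rpc/...` 和头部 `X-Des-Channel` 是为说明而假设的,原文并未规定具体的路径或头部。

```python
from dataclasses import dataclass, field

@dataclass
class HttpRequest:
    method: str
    path: str
    headers: dict = field(default_factory=dict)
    body: bytes = b""

def wrap_thrift(service: str, method: str, thrift_payload: bytes) -> HttpRequest:
    # 将已序列化的Thrift二进制原样放入HTTP主体,生成新的HTTP请求
    return HttpRequest(
        method="POST",
        path=f"/des-rpc/{service}/{method}",      # 假设的路径约定
        headers={"Content-Type": "application/x-thrift",
                 "X-Des-Channel": "thrift"},       # 假设的通道标记
        body=thrift_payload,
    )

def unwrap_thrift(req: HttpRequest) -> bytes:
    # 目的侧的Thrift代理从HTTP请求中恢复原始Thrift二进制
    if req.headers.get("X-Des-Channel") != "thrift":
        raise ValueError("not a wrapped Thrift call")
    return req.body
```

这样,受信区域中的组件只需要处理HTTP这一种统一协议,而Thrift负载在两端被无损地恢复。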
图3M示出了根据本公开的一些实施例的在图3K所示的服务调用通道中从TTP到非TTP的数据交换示例。如图3M所示,由TTP的服务A 3951发起的调用将由TTP的HTTP代理3952和Thrift代理3953转发到TTP的HTTP负载均衡器3955。对于HTTP请求,调用将由HTTP代理3952转发到HTTP负载均衡器3955。对于Thrift请求,调用将由Thrift代理3953转发到HTTP负载均衡器3955。
对于非法的请求,将返回错误。对于未通过合规性检查的响应,将返回错误。对于Thrift rpc调用,请求将被HTTP包裹以生成新的HTTP请求。新的HTTP请求的主体为Thrift二进制文件。
TTP的HTTP负载均衡器3955将请求转发到非TTP(也即,境外区域)的HTTP代理3957和Thrift代理3958。然后HTTP代理3957和Thrift代理3958将请求转发到境外区域的服务B 3959和服务C 3960。
对于Thrift rpc调用,Thrift代理会在发送之前从所生成的新的HTTP请求中恢复原始Thrift请求。
非TTP的HTTP代理3957和Thrift代理3958将向TTP的HTTP负载均衡器3955发送响应。对于Thrift rpc调用,Thrift响应将被HTTP包裹以生成新的HTTP响应。新的HTTP响应的主体为Thrift二进制文件。
安全沙盒子系统
客户端应用需要与服务器通信以传输数据。客户端应用的网络流量可以传输大量的用户数据。因此,需要一种能够管理客户端应用的网络流量的方法,以使得用户数据不会经由客户端应用的网络流量被传输到未经允许的服务器。例如,在数据主权保护的场景下,该方法可以防止用户数据被传输到非数据主权国家的服务器。
然而,客户端应用的网络流量的类型十分丰富。客户端应用可以包括移动端应用和电脑(PC端)应用。客户端应用的网络流量可以包括原生类型的网络流量和网页视图类型的网络流量等。此外,客户端应用的网络流量并非都在应用的所有者的管理和控制之下。例如,客户端应用的网络流量可以包括来自第三方广告商的网络流量。因此,管理客户端应用的各种类型的网络流量是十分困难的。
本公开的示例实施例提出了一种管理客户端应用的网络流量的方法。该方法包括:基于对目标用户的确定,检测目标用户的用户数据从客户端应用到服务器的网络传输;基于网络传输对应的网络流量的类型,在网络传输的不同层分析网络流量;以及基于分析指示网络流量满足与目标用户对应的数据交换约束,将网络流量发送到由数据交换约束限定的服务器。
以此方式,通过基于网络流量的类型在网络传输的不同层分析网络流量以及限制不满足数据交换约束的网络流量的传输,可以有效地防止用户数据经由各种类型的网络流量传输到未经允许的服务器。
以下将参照附图来具体描述本公开的实施例。下文将以移动端应用为例来示例性地说明本公开的方案。
图4A示出了根据本公开的一些实施例的管理移动端应用的网络流量的示例方法4100的流程图。该方法4100例如可以在图1的安全沙盒子系统1090处实施。移动端应用可以是移动端的目标应用1080。
在框4102,基于对目标用户的确定,检测目标用户的用户数据从目标应用1080到服务器的网络传输。换言之,如果确定当前用户是目标用户,则安全沙盒子系统1090可以检测目标用户的用户数据的网络传输。
在一些实现中,可以基于对目标用户的确定将网络流量路由到安全沙盒子系统1090,以使得安全沙盒子系统1090可以检测和分析与用户数据的网络传输对应的网络流量。安全沙盒子系统1090可以分析目标应用1080的网络请求并且基于数据交换约束来限制不满足条件的网络请求。
数据交换约束可以包括与数据主权有关的交换约束,例如数据主权保护规则。数据主权保护规则可以根据各个国家或区域的规定而被确定。数据主权保护规则也可以由应用的运营方确定(例如,与用户数据使用协议有关)。
数据主权保护规则可以基于具体场景而被设置。例如,数据主权保护规则可以规定不允许数据主权国家的用户数据传输到数据主权国家之外的任何服务器。在另一些实现中,数据主权保护规则可以规定不允许数据主权国家的隐私的用户数据传输到未经注册的任何服务器。本公开的范围对此不作限制。
如图1所示,目标应用1080的网络请求经由安全沙盒子系统1090的分析和处理之后被传输到应用防火墙子系统1020。安全沙盒子系统1090的原理和细节将在下文详细描述。
目标用户是指其用户数据的传输需要被检测和管理的用户。目标用户可以是具有数据主权国家的国籍的用户。备选地或附加地,目标用户也可以是根据数据主权保护的具体规则而确定的用户。例如,目标用户可以是具有数据主权国家的国籍并且当前在地理上位于该数据主权国家中的用户。
在一些实现中,可以基于用户信息来确定目标用户。用户信息可以包括用户的账号信息、个人信息、注册信息等。备选地或附加地,可以基于设备信息来确定目标用户。设备信息可以包括用户身份识别模块(Subscriber Identity Module,SIM)信息、IP地址、网络服务提供商信息、设备的系统设置信息、应用设置信息等。
在一些实现中,可以基于多种信息的组合来确定目标用户。多种信息可以具有不同的优先级。例如,SIM信息、网络服务提供商信息的优先级可以高于IP地址、系统设置信息、应用设置信息等。
在一些实现中,对目标用户的确定可以基于对目标用户所在地区的确定。可以利用上述用户信息或设备信息来确定目标用户的所在地区,从而确定目标用户。例如,可以利用智能手机的系统设置中的地区设置来确定当前用户所在地区,并以此确定当前用户是否为目标用户。又例如,可以利用SIM卡的国家代码来确定目标用户所在地区,并以此来确定目标用户。
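按优先级组合多种信息来确定用户所在地区的做法,可以用下面的草图来示意。原文只说明SIM信息和网络服务提供商信息的优先级可以高于IP地址、系统设置信息和应用设置信息;各组内部的先后顺序以及信号名称(如 `sim_country`)均为此处的假设。

```python
from typing import Optional

# 假设的信号优先级:靠前的信号优先于靠后的信号
SIGNAL_PRIORITY = [
    "sim_country",      # SIM卡的国家代码
    "carrier_country",  # 网络服务提供商信息
    "ip_country",       # 由IP地址推断的地区
    "system_region",    # 系统设置中的地区
    "app_region",       # 应用设置中的地区
]

def resolve_region(signals: dict) -> Optional[str]:
    # 返回优先级最高的可用信号所指示的地区
    for name in SIGNAL_PRIORITY:
        value = signals.get(name)
        if value:
            return value
    return None

def is_target_user(signals: dict, sovereign_region: str) -> bool:
    return resolve_region(signals) == sovereign_region
```

例如,当SIM信息与IP地址指示的地区不一致时,这一草图会以SIM信息为准。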
在一些实现中,可以在初次启动应用时确定目标用户。换言之,可以在初次启动应用时确定当前用户是否是目标用户。备选地或附加地,可以在用户注册时确定当前用户是否是目标用户。备选地或附加地,可以在用户登入、登出、切换账号时确定当前用户是否是目标用户。
在一些实现中,可以将确定结果存储在本地或服务器中。可以在第一次将用户确定为目标用户之后存储确定结果并且设置在阈值时间段内使用所存储的确定结果。这样,当用户再次登录时,可以无需再次对用户进行确定。
在框4104,基于网络传输对应的网络流量的类型,在网络传输的不同层分析网络流量。
目标应用1080中的网络流量可以包括多个类型的网络流量,例如原生(native)、网页视图(Webview)和第三方软件开发工具包(SDK)类型的网络流量。原生类型的网络流量由业务层中的操作系统(例如,安卓和IOS)代码产生和处理。原生类型的网络流量可以完全由目标应用1080的所有者来控制。
第三方SDK类型的网络流量由第三方SDK产生和处理。通常,目标应用1080中可以接入第三方SDK以用于实现登录或分享的功能。第三方SDK类型的网络流量由这些第三方SDK产生和处理。应理解,第三方SDK类型的网络流量通常不能完全由应用的所有者来控制。
网页视图类型的网络流量可以包括由应用的所有者控制的网络流量,例如,应用内置的浏览器通过调用原生应用的代码而产生的网络流量。网页视图类型的网络流量还可以包括由第三方控制的网络流量。例如,由第三方广告商产生和控制的网络流量。
基于网络流量的类型,安全沙盒子系统1090可以采取相应的分析策略,从而更好地管理应用中的用户数据的网络传输。
在框4106,基于分析指示网络流量满足与目标用户对应的数据交换约束,将网络流量发送到由数据交换约束限定的服务器。可以针对不同的目标用户设置不同的数据交换约束。例如,对于敏感等级更高的目标用户,可以设置更严格的数据交换约束。数据交换约束可以限定哪些用户数据可以被传输到哪些服务器。在一些实现中,可以基于目标用户的用户信息或对应的设备信息来确定目标用户对应的数据交换约束。
在一些实现中,安全沙盒子系统1090可以包括针对不同类型的网络流量的多个子模块。例如,用于管理原生类型的网络流量的子模块、用于管理网页视图类型的网络流量的子模块、以及用于管理第三方SDK类型的网络流量的子模块。这些子模块可以分析相应类型的网络流量,以及限制或拦截不满足数据交换约束的网络流量。下文将参考图4B至图4E来详细描述针对不同类型的网络流量的管理的细节。
图4B示出了根据本公开的一些实施例的针对原生类型的网络流量的分析和限制过程4200的示意图。图4B示出了用于分析和限制原生类型的网络流量的子模块4210。子模块4210可以是安全沙盒子系统1090的一部分,也可以是安全沙盒子系统1090的一种具体实现方式。
如图4B所示,业务逻辑层4220将网络请求下发给底层OS 4230。业务逻辑层4220可以是图1所示的应用业务逻辑1100在网络传输方面的一种具体实现。子模块4210可以作为拦截器在网络层分析和限制网络请求。子模块4210可以通过分析端点、网络请求的参数、或模式(schema)来限制网络请求。例如,可以基于schema是否已被注册来确定是否限制该网络请求。备选地或附加地,可以基于网络请求中所请求的字段是否涉及敏感信息来确定是否限制该网络请求。
在一些实现中,子模块4210可以包括针对安卓的拦截器、针对iOS的拦截器。附加地,子模块4210还可以包括针对C++的拦截器。以此方式,通过在网络层分析和限制网络请求,可以基于网络请求的协议信息来更好地判断该网络请求是否应该被限制。
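拦截器基于端点是否已注册以及请求参数是否涉及敏感字段来做出判断,这一逻辑可以用下面的草图来示意。已注册端点列表 `REGISTERED_ENDPOINTS` 和敏感参数名 `SENSITIVE_PARAMS` 都是假设的示例值;真实的拦截器运行在客户端的网络栈中,此处仅用Python说明判断逻辑。

```python
from urllib.parse import urlsplit, parse_qs

# 假设的已注册端点(协议, 主机名)与敏感参数名
REGISTERED_ENDPOINTS = {("https", "api.example.com")}
SENSITIVE_PARAMS = {"phone", "id_number"}

def should_block(url: str) -> bool:
    parts = urlsplit(url)
    # 端点未注册:限制该网络请求
    if (parts.scheme, parts.hostname) not in REGISTERED_ENDPOINTS:
        return True
    # 请求参数涉及敏感字段:限制该网络请求
    params = parse_qs(parts.query, keep_blank_values=True)
    return bool(SENSITIVE_PARAMS & set(params))
```

这里使用 `keep_blank_values=True`,以免值为空的敏感参数在解析时被丢弃而漏检。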
图4C示出了根据本公开的一些实施例的针对网页视图类型的网络流量的分析和限制过程4300的示意图。图4C示出了用于分析和限制网页视图类型的网络流量的子模块4310。子模块4310可以是安全沙盒子系统1090的一部分,也可以是安全沙盒子系统1090的一种具体实现方式。
子模块4310可以将网页视图类型的网络流量转移到原生的网络接口,以使得网页视图类型的网络流量可以由针对原生类型的网络流量的子模块4210来分析和限制。在一些实现中,子模块4310可以利用JavaScript(JS)的钩子(hook)机制来将网页视图类型的网络流量转移到原生的网络接口。
如图4C所示,子模块4310可以包括启动器4311、导航URL拦截器4312和内部请求拦截器4313。子模块4310可以与应用内置的浏览器4320通信,以使得网页视图类型的网络流量可以被子模块4310管理和检测。启动器4311可以在应用内置的浏览器4320打开(被创建)时进行JS注入,以使得网页视图类型的网络流量可以利用hook机制被转移到原生的网络接口。被转移到原生的网络接口的网络流量可以由原生的网络模块接管。
在一些实现中,可以使用如下方式来利用JS hook技术进行网络流量的转移。
导航URL拦截器4312可以分析和限制主页面(初始页面)的URL。例如,导航URL拦截器4312可以基于URL的schema是否被注册来确定是否限制该网络请求。如果该网络请求未被限制,则浏览器4320可以加载该主页面。
内部请求拦截器4313可以将与主页面的静态资源和动态资源有关的网络流量转接到原生的网络接口,以使得这些网络流量可以由子模块4210来在网络层进行限制和分析。具体的分析和限制过程与原生类型的网络流量类似,在此不再赘述。
在一些实现中,针对由应用的所有者控制的网页视图类型的网络流量和由第三方控制的网页视图类型的网络流量,子模块4310可以采取不同的分析和限制策略。例如,针对由第三方控制的网页视图类型的网络流量,可以仅利用导航URL拦截器4312来确定主页面的URL是否被注册来分析相关的网络流量,而无需进一步分析主页面的静态资源和动态资源。
图4D示出了根据本公开的一些实施例的针对第三方SDK类型的网络流量的分析和限制过程4400的示意图。图4D示出了用于分析和限制第三方SDK类型的网络流量的子模块4410。子模块4410可以是安全沙盒子系统1090的一部分,也可以是安全沙盒子系统1090的一种具体实现方式。
子模块4410可以在应用程序接口(API)层分析和限制第三方SDK类型的网络流量。子模块4410可以通过在API层分析第三方SDK的API所请求的数据是否满足数据交换约束来限制第三方SDK类型的网络流量。
在一些实现中,子模块4410可以包裹(wrap)第三方SDK中请求用户数据的API,并且在包裹中添加基于数据交换约束的判断逻辑。换言之,子模块4410可以通过向第三方SDK的API添加判断逻辑来确定包裹API。这样,业务逻辑层4220不是直接调用第三方SDK的API,而是调用添加了判断逻辑的包裹API。
如图4D所示,子模块4410可以包括分别针对每个第三方SDK的包裹模块。例如,针对SDK 4411的包裹模块4412、针对SDK 4413的包裹模块4414、以及针对SDK 4415的包裹模块4416。包裹模块(例如,包裹模块4412)可以包裹对应的SDK(例如,SDK 4411)中的API,以生成对应的包裹API。在一些实现中,子模块4410可以动态地增加包裹模块以包裹第三方SDK的API。
在一些实现中,可以通过如下方式来包裹第三方SDK的API。包裹模块4412可以定义暴露给业务层的、与SDK 4411中的API相同的API。包裹模块4412可以实现该API并且定义SDK 4411数据类型的包裹类。
判断逻辑可以基于数据交换约束来确定是否可以调用被包裹的第三方SDK的API。在一些实现中,判断逻辑可以基于SDK的名称、API的名称、API的参数的名称等来分析第三方SDK的API是否可以被调用。如果判断结果为是,则第三方SDK的API可以被调用,并向业务层返回值。如果判断结果为否,则不调用第三方SDK的API,也即,与该API相关的网络流量被限制。应理解,判断逻辑可以基于具体场景而变化。例如,判断逻辑可以设置为不允许向第三方SDK传入用户的隐私数据。
以此方式,通过在API层进行分析和限制,子模块4410可以在无需知晓第三方SDK的内部代码的情况下管理和检测第三方SDK类型的网络流量。
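包裹API的思路可以用下面的草图来示意:包裹类暴露与SDK相同的方法,并在转发调用之前执行判断逻辑。`ThirdPartyShareSdk` 只是一个用来替代真实第三方SDK的假想类,`ShareSdkWrapper`、`BlockedByPolicy` 和参数名 `user_phone` 同样是为说明而假设的。

```python
class ThirdPartyShareSdk:
    # 假想的第三方SDK;包裹模块不需要了解其内部实现
    def share(self, title, user_phone=None):
        return f"shared:{title}"

class BlockedByPolicy(Exception):
    """调用因不满足数据交换约束而被限制。"""

class ShareSdkWrapper:
    # 假设的判断逻辑:不允许向第三方SDK传入用户的隐私数据
    PRIVATE_PARAMS = {"user_phone"}

    def __init__(self, sdk):
        self._sdk = sdk

    def share(self, **kwargs):
        passed = {name for name, value in kwargs.items() if value is not None}
        leaked = self.PRIVATE_PARAMS & passed
        if leaked:
            # 判断结果为否:不调用第三方SDK的API
            raise BlockedByPolicy(f"private parameters not allowed: {sorted(leaked)}")
        # 判断结果为是:调用被包裹的API并向业务层返回值
        return self._sdk.share(**kwargs)
```

业务逻辑层只持有 `ShareSdkWrapper`,因此所有对该SDK的调用都会先经过判断逻辑。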
图4E示出了根据本公开的一些实施例的安全沙盒子系统1090的模块图。如图4E所示,安全沙盒子系统1090包括启动模块4520。启动模块4520被配置为基于对目标用户的确定,启动对目标用户的用户数据从客户端应用到服务器的网络传输的检测。启动模块4520可以激活管理模块来检测、管理、分析和限制与用户数据的网络传输对应的网络流量。
管理模块被配置为基于网络传输对应的网络流量的类型,在网络传输的不同层分析网络流量;以及基于分析指示网络流量满足与目标用户对应的数据交换约束,将网络流量发送到由数据交换约束限定的服务器。
在一些实现中,管理模块可以包括子模块(也称为第一管理模块)4210、子模块(也称为第二管理模块)4310和子模块(也称为第三管理模块)4410。子模块4210、子模块4310和子模块4410可以分析和限制客户端应用的网络流量。
在一些实现中,子模块4210被配置为基于网络流量的类型为原生类型的网络流量,在网络层分析网络流量。
在一些实现中,子模块4310被配置为基于网络流量的类型为网页视图类型的网络流量,将网页视图类型的网络流量转移到客户端应用的网络接口以由客户端应用的原生网络模块管理;以及在网络层分析所转移的网络流量。
在一些实现中,将网页视图类型的网络流量转移到移动端应用的网络接口包括:利用JavaScript的钩子机制来转移网页视图类型的网络流量。
在一些实现中,子模块4410被配置为基于网络流量的类型为第三方SDK类型的网络流量,在应用程序接口API层分析网络流量。
在一些实现中,在API层分析所述网络流量包括:通过向第三方SDK的API添加基于所述数据交换约束的判断逻辑来确定包裹API;以及调用所述包裹API以利用所述判断逻辑来分析所述网络流量。
在一些实现中,启动模块4520可以基于对目标用户的确定来激活子模块4210、4310和4410。例如,启动模块4520可以在用户注册时确定当前用户是否是目标用户。如果确定结果为是,则启动模块4520可以激活子模块4210、4310和4410。又例如,启动模块4520可以在用户登录时从本地或服务器获取对该用户的确定结果,并基于确定结果来确定是否激活子模块4210、4310和4410。
安全沙盒子系统1090中还可以包括用于对网络流量进行采样的采样模块4510。在一些实现中,采样模块4510可以向启动模块4520发送采样信号来触发启动模块4520。采样信号可以指示对网络流量进行采样的采样率。
采样模块4510可以基于数据交换约束来采样目标用户和不同类型的网络流量。例如,采样模块4510可以以不同的采样率来对不同类型的网络流量进行采样。利用采样模块4510,可以仅分析网络流量中的一部分,从而降低开销并维持应用的稳定性。
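按流量类型使用不同采样率的做法可以用下面的草图来示意。各类型对应的采样率是假设的示例值,原文并未给出具体数值。

```python
import random

# 假设的采样率:不同类型的网络流量使用不同的采样率
SAMPLE_RATES = {"native": 1.0, "webview": 0.5, "third_party_sdk": 0.1}

def should_analyze(traffic_type: str, rng: random.Random) -> bool:
    # 以该类型的采样率为概率决定是否分析当前请求;未知类型不采样
    return rng.random() < SAMPLE_RATES.get(traffic_type, 0.0)
```

传入独立的 `random.Random` 实例便于在测试中固定随机种子,从而复现采样结果。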
应理解,安全沙盒子系统1090还可以包括其他模块,或者仅包括图4E所示的部分模块。例如,在目标应用1080仅是移动端的原生应用时,安全沙盒子系统1090可以不包括针对网页视图类型的网络流量的子模块4310。本公开的范围对此不作限制。
在一些实现中,基于网络流量的类型,还可以在套接字(Socket)层来分析和限制网络流量。例如,可以在Socket层转接第三方SDK类型的网络流量,以使得可以直接分析第三方SDK类型的网络请求。备选地或附加地,也可以在Socket层分析和限制原生类型的网络流量以及网页视图类型的网络流量。
在一些实现中,还可以在目标应用1080上建立作为代理的本地服务器。通过将目标应用1080的网络请求转发给本地服务器,并且通过在本地服务器分析和限制网络流量,可以管理由本地服务器转发给外部服务器的网络请求。以此方式,可以在考虑协议信息的情况下分析和限制不同类型的网络流量,从而更好地管理应用的网络流量不被传输到未经允许的外部服务器。
上文参考图4B至图4E详细描述了针对不同类型的网络流量的分析和限制的原理和细节。应理解,上述限制规则、判断逻辑和数据交换约束仅是示例性的,而非限制本公开的范围。例如,根据不同国家的法律法规要求可以设置不同的数据主权保护规则。此外,取决于计算机网络的层的定义,可以在与上述层相近或相似的层处来分析和限制网络流量。
此外,在上文的描述中,安全沙盒子系统1090可以直接分析和限制目标应用1080中的网络流量。换言之,只有未被安全沙盒子系统1090限制的网络流量才能继续传输。备选地或附加地,安全沙盒子系统1090可以不直接限制网络流量,而是仅提供分析报告。在这种情况下,可以在正常传输网络请求的同时向安全沙盒子系统1090发送网络请求的副本。安全沙盒子系统1090可以分析网络请求的副本,并提供分析报告。
在一些实现中,针对多个数据主权国家的情况,可以相应地设置多个安全沙盒子系统1090来分别针对每个数据主权国家进行处理。例如,可以基于对目标用户所在地区的确定,启动对应的安全沙盒子系统来分析和限制网络流量,以使得应用中的用户数据的网络传输符合对应国家的数据主权保护规则。
推荐管理子系统
如上文所讨论的,目标应用例如可以通过推荐机制来向用户提供各种内容的推荐,诸如,多媒体内容推荐、用户推荐、商品推荐等等。在这样的应用中,推荐策略的公平性已经成为许多地区管理的重点。例如,一些应用可能通过推荐机制来引导用户去关注与用户习惯无关的特定内容,则这样的推荐机制可能是不合规的。
另一方面,通常的推荐算法往往依赖于机器学习模型来实施,诸如由安全计算子系统1060所执行的代码层级核查可能无法有效地检测推荐算法的公平性。而另一方面,推荐模型的训练与更新往往与实际用户数据息息相关,人们也不期望在检查过程中暴露用户的隐私数据,因为这可能会导致数据合规的风险。
本公开的实施例进一步提出了一种管理推荐策略的方案。图5示出了管理推荐策略的过程500的流程图。该过程500例如可以由推荐管理子系统1050执行。
如图5所示,在框502,推荐管理子系统1050获取与目标应用中的一组对象相关联的一组对象特征,其中一组对象特征是基于一组对象的属性而转换得到的,一组对象特征不直接表达一组对象的属性。
在一些实施例中,推荐管理子系统1050可以经由目标应用提供的应用程序接口API来获取该组对象特征。在一些实施例中,推荐管理子系统1050例如可以经由专用API而从目标应用平台1030来获取与目标应用1080中的一组对象相关联的一组对象特征。
在一些实施例中,该组对象特征例如可以是由特征提取模型基于该组对象的属性而转换得到的。基于这样的方式,可以使得推荐策略的管理方或者其他第三方无法基于对象特征来确定对象的原始属性信息。由此,可以保证目标应用中的数据的安全性。
在框504,推荐管理子系统1050从一组对象特征中确定第一对象特征和第二对象特征,其中第一对象特征和第二对象特征之间的第一差异小于第一阈值。
在一些实施例中,该组对象特征例如可以表示为多个向量。进一步地,推荐管理子系统1050例如可以基于向量之间的差异,而从该组对象特征中选择出差异小于第一阈值的至少一对对象特征。
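从一组特征向量中挑选差异小于第一阈值的特征对,可以用下面的草图来示意。这里假设用欧氏距离来度量"差异",原文并未限定具体的度量方式;`similar_pairs` 为示意性命名。

```python
import math
from itertools import combinations

def similar_pairs(features: dict, first_threshold: float) -> list:
    # features: 对象标识 -> 特征向量;返回差异小于第一阈值的标识对
    return [
        (a, b)
        for a, b in combinations(sorted(features), 2)
        if math.dist(features[a], features[b]) < first_threshold
    ]
```

该草图逐对比较,复杂度为对象数量的平方;对象较多时可以先随机采样再比较。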
在框506,推荐管理子系统1050基于目标应用中的推荐策略,确定与第一对象特征对应的第一推荐结果和与第二对象特征对应的第二推荐结果。
在一些实施例中,推荐管理子系统1050可以将第一对象特征提供至与推荐策略相关联的推荐模型,以确定第一推荐结果,并且将第二对象特征提供至推荐模型,以确定第二推荐结果。
在一些实施例中,为了保证推荐策略的安全性,可以由推荐管理子系统1050经由目标应用提供的API而将所选择得到的第一对象特征和第二对象特征发送至远程运行的推荐模型,以用于确定第一推荐结果和第二推荐结果。示例性地,该推荐模型例如可以是由目标应用的维护方所运行的。
在一些实施例中,生成第一推荐结果和第二推荐结果的过程将不会影响目标应用中真正部署的推荐模型。
在一些实施例中,第一推荐结果和第二推荐结果可以是由推荐模型输出的向量表示。由此,推荐管理子系统1050将无法直接解读第一推荐结果和第二推荐结果的语义,从而进一步提高了目标应用内数据的安全性。
在框508,推荐管理子系统1050基于第一推荐结果和第二推荐结果,评估推荐策略。
在一些实施例中,推荐管理子系统1050可以确定第一推荐结果和第二推荐结果之间的第二差异,并且基于第二差异与第二阈值之间的比较,确定推荐策略的公平性。
具体地,对于合理的推荐策略而言,对于两个相似的对象,其推荐结果应当是相似的。因此,如果推荐管理子系统1050确定第二差异超过第二阈值时,可以确定推荐策略具有较差的公平性。
或者,推荐管理子系统1050也可以基于第二差异超过第二阈值的对象特征对的占比,来确定推荐策略的公平性。例如,推荐管理子系统1050可以随机采样多组对象特征对,并且如果其中第二差异超过第二阈值的对象特征对的占比超过阈值占比,则可以确定推荐策略具有较差的公平性。
在一些实施例中,推荐管理子系统1050还可以基于用于输入到推荐模型的对象特征和历史推荐结果之间的相关性,来确定推荐策略的公平性。具体地,推荐管理子系统1050还可以从目标应用获取第三对象特征和针对第三对象特征的历史推荐结果。进一步地,推荐管理子系统1050基于第三对象特征和历史推荐结果之间的相关性,确定推荐策略的公平性。例如,推荐管理子系统1050可以确定对象特征与历史推荐结果的类别信息是否匹配,并据此确定二者之间的相关性。
在一些实施例中,推荐管理子系统1050可以确定与第三对象特征和历史推荐结果对应的向量表示,并基于两个向量表示之间的差异来确定第三对象特征和历史推荐结果之间的相关性。例如,如果一个对象和其历史推荐结果的向量差异大于阈值,则推荐管理子系统1050可以确定推荐策略具有较差的公平性。
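基于占比的公平性判断可以用下面的草图来示意。其中 `recommend` 代表对推荐模型的调用(在原文中该模型可以远程运行),这里用一个普通函数来替代;欧氏距离以及 `unfair_ratio`、`is_fair` 等命名都是此处的假设。

```python
import math
from typing import Callable, Sequence, Tuple

Vector = Tuple[float, ...]

def unfair_ratio(pairs: Sequence[Tuple[Vector, Vector]],
                 recommend: Callable[[Vector], Vector],
                 second_threshold: float) -> float:
    # 统计第二差异超过第二阈值的相似特征对所占的比例
    if not pairs:
        return 0.0
    violations = sum(
        1 for x, y in pairs
        if math.dist(recommend(x), recommend(y)) > second_threshold
    )
    return violations / len(pairs)

def is_fair(pairs, recommend, second_threshold: float, max_ratio: float) -> bool:
    # 占比不超过阈值占比时,认为推荐策略的公平性可以接受
    return unfair_ratio(pairs, recommend, second_threshold) <= max_ratio
```

评估方只接触特征向量和推荐结果向量,不需要解读二者的语义。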
在一些实施例中,如上文所提及的,还可以由安全计算子系统1060例如对于推荐策略相关联的源代码进行检查。具体地,安全计算子系统1060例如可以获取与推荐策略相对应的源代码,并且基于源代码或与源代码对应的中间代码来评估推荐策略。
在一些实施例中,推荐策略例如可以用于向目标应用1080中的用户推荐至少一项多媒体内容。多媒体内容的示例例如可以包括:图像、视频、音乐或其组合等。
示例装置和设备
本公开的实施例还提供了用于实现上述方法或过程的相应装置。图6示出了根据本公开的一些实施例的管理推荐策略的装置600的示意性结构框图。
如图6所示,装置600包括:获取模块610,被配置为获取与目标应用中的一组对象相关联的一组对象特征,一组对象特征是基于一组对象的属性而转换得到的,一组对象特征不直接表达一组对象的属性。
装置600还包括选择模块620,被配置为从一组对象特征中确定第一对象特征和第二对象特征,第一对象特征和第二对象特征之间的第一差异小于第一阈值。
装置600还包括确定模块630,被配置为基于目标应用中的推荐策略,确定与第一对象特征对应的第一推荐结果和与第二对象特征对应的第二推荐结果。
装置600还包括评估模块640,被配置为基于第一推荐结果和第二推荐结果,评估推荐策略。
在一些实施例中,获取模块610还被配置为:经由目标应用提供的应用程序接口API来获取一组对象特征。
在一些实施例中,确定模块630还被配置为:将第一对象特征提供至与推荐策略相关联的推荐模型,以确定第一推荐结果;以及将第二对象特征提供至推荐模型,以确定第二推荐结果。
在一些实施例中,推荐模型是由目标应用的维护方所运行。
在一些实施例中,第一推荐结果和第二推荐结果是由推荐模型输出的向量表示。
在一些实施例中,评估模块640还被配置为:确定第一推荐结果和第二推荐结果之间的第二差异;以及基于第二差异与第二阈值之间的比较,确定推荐策略的公平性。
在一些实施例中,装置600还包括:历史获取模块,被配置为从目标应用获取第三对象特征和针对第三对象特征的历史推荐结果,其中历史推荐结果包括由推荐模型生成的向量表示;以及比较模块,被配置为基于第三对象特征和历史推荐结果,确定推荐策略的公平性。
在一些实施例中,推荐策略用于向目标应用中的用户推荐至少一项多媒体内容。
在一些实施例中,装置600还包括:代码获取模块,被配置为获取与推荐策略相对应的源代码;以及代码评估模块,被配置为基于源代码或与源代码对应的中间代码,评估推荐策略。
图7示出了可以用来实施本公开内容的实施例的示例设备700的示意性框图。例如,根据本公开实施例的系统100和/或系统400可以由设备700来实施。如图所示,设备700包括中央处理单元(CPU)701,其可以根据存储在只读存储器(ROM)702中的计算机程序指令或者从存储单元708加载到随机访问存储器(RAM)703中的计算机程序指令,来执行各种适当的动作和处理。在RAM 703中,还可存储设备700操作所需的各种程序和数据。CPU 701、ROM 702以及RAM 703通过总线704彼此相连。输入/输出(I/O)接口705也连接至总线704。
设备700中的多个部件连接至I/O接口705,包括:输入单元706,例如键盘、鼠标等;输出单元707,例如各种类型的显示器、扬声器等;存储单元708,例如磁盘、光盘等;以及通信单元709,例如网卡、调制解调器、无线通信收发机等。通信单元709允许设备700通过诸如因特网的计算机网络和/或各种电信网络与其他设备交换信息/数据。
上文所描述的各个过程和处理,例如过程500,可由处理单元701执行。例如,在一些实施例中,过程500可被实现为计算机软件程序,其被有形地包含于机器可读介质,例如存储单元708。在一些实施例中,计算机程序的部分或者全部可以经由ROM 702和/或通信单元709而被载入和/或安装到设备700上。当计算机程序被加载到RAM 703并由CPU 701执行时,可以执行上文描述的过程500的一个或多个动作。
本公开可以是方法、装置、系统和/或计算机程序产品。计算机程序产品可以包括计算机可读存储介质,其上载有用于执行本公开的各个方面的计算机可读程序指令。
计算机可读存储介质可以是可以保持和存储由指令执行设备使用的指令的有形设备。计算机可读存储介质例如可以是但不限于电存储设备、磁存储设备、光存储设备、电磁存储设备、半导体存储设备或者上述的任意合适的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括:便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、静态随机存取存储器(SRAM)、便携式压缩盘只读存储器(CD-ROM)、数字多功能盘(DVD)、记忆棒、软盘、机械编码设备、例如其上存储有指令的打孔卡或凹槽内凸起结构、以及上述的任意合适的组合。这里所使用的计算机可读存储介质不被解释为瞬时信号本身,诸如无线电波或者其他自由传播的电磁波、通过波导或其他传输媒介传播的电磁波(例如,通过光纤电缆的光脉冲)、或者通过电线传输的电信号。
这里所描述的计算机可读程序指令可以从计算机可读存储介质下载到各个计算/处理设备,或者通过网络、例如因特网、局域网、广域网和/或无线网下载到外部计算机或外部存储设备。网络可以包括铜传输电缆、光纤传输、无线传输、路由器、防火墙、交换机、网关计算机和/或边缘服务器。每个计算/处理设备中的网络适配卡或者网络接口从网络接收计算机可读程序指令,并转发该计算机可读程序指令,以供存储在各个计算/处理设备中的计算机可读存储介质中。
用于执行本公开操作的计算机程序指令可以是汇编指令、指令集架构(ISA)指令、机器指令、机器相关指令、微代码、固件指令、状态设置数据、或者以一种或多种编程语言的任意组合编写的源代码或目标代码,所述编程语言包括面向对象的编程语言—诸如Smalltalk、C++等,以及常规的过程式编程语言—诸如“C”语言或类似的编程语言。计算机可读程序指令可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络—包括局域网(LAN)或广域网(WAN)—连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。在一些实施例中,通过利用计算机可读程序指令的状态信息来个性化定制电子电路,例如可编程逻辑电路、现场可编程门阵列(FPGA)或可编程逻辑阵列(PLA),该电子电路可以执行计算机可读程序指令,从而实现本公开的各个方面。
这里参照根据本公开实施例的方法、装置(系统)和计算机程序产品的流程图和/或框图描述了本公开的各个方面。应当理解,流程图和/或框图的每个方框以及流程图和/或框图中各方框的组合,都可以由计算机可读程序指令实现。
这些计算机可读程序指令可以提供给通用计算机、专用计算机或其它可编程数据处理装置的处理单元,从而生产出一种机器,使得这些指令在通过计算机或其它可编程数据处理装置的处理单元执行时,产生了实现流程图和/或框图中的一个或多个方框中规定的功能/动作的装置。也可以把这些计算机可读程序指令存储在计算机可读存储介质中,这些指令使得计算机、可编程数据处理装置和/或其他设备以特定方式工作,从而,存储有指令的计算机可读介质则包括一个制造品,其包括实现流程图和/或框图中的一个或多个方框中规定的功能/动作的各个方面的指令。
也可以把计算机可读程序指令加载到计算机、其它可编程数据处理装置、或其它设备上,使得在计算机、其它可编程数据处理装置或其它设备上执行一系列操作步骤,以产生计算机实现的过程,从而使得在计算机、其它可编程数据处理装置、或其它设备上执行的指令实现流程图和/或框图中的一个或多个方框中规定的功能/动作。
附图中的流程图和框图显示了根据本公开的多个实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段或指令的一部分,所述模块、程序段或指令的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个连续的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或动作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
以上已经描述了本公开的各实施方式,上述说明是示例性的,并非穷尽性的,并且也不限于所披露的各实施方式。在不偏离所说明的各实施方式的范围和精神的情况下,对于本技术领域的普通技术人员来说许多修改和变更都是显而易见的。本文中所用术语的选择,旨在最好地解释各实施方式的原理、实际应用或对市场中的技术的改进,或者使本技术领域的其他普通技术人员能理解本文披露的各实施方式。

Claims (13)

  1. 一种用于管理目标应用中的推荐策略的方法,包括:
    获取与所述目标应用中的一组对象相关联的一组对象特征,所述一组对象特征是基于所述一组对象的属性而转换得到的,所述一组对象特征不直接表达所述一组对象的所述属性;
    从所述一组对象特征中确定第一对象特征和第二对象特征,所述第一对象特征和所述第二对象特征之间的第一差异小于第一阈值;
    基于所述目标应用中的所述推荐策略,确定与所述第一对象特征对应的第一推荐结果和与所述第二对象特征对应的第二推荐结果;以及
    基于所述第一推荐结果和所述第二推荐结果,评估所述推荐策略。
  2. 根据权利要求1所述的方法,其中获取与所述目标应用中的一组对象相关联的一组对象特征包括:
    经由所述目标应用提供的应用程序接口API来获取所述一组对象特征。
  3. 根据权利要求1所述的方法,其中确定与所述第一对象特征对应的第一推荐结果和与所述第二对象特征对应的第二推荐结果包括:
    将所述第一对象特征提供至与所述推荐策略相关联的推荐模型,以确定所述第一推荐结果;以及
    将所述第二对象特征提供至所述推荐模型,以确定所述第二推荐结果。
  4. 根据权利要求3所述的方法,其中将所述第一对象特征提供至所述推荐模型包括:
    经由所述目标应用提供的API向远程运行的所述推荐模型发送所述第一对象特征。
  5. 根据权利要求3所述的方法,其中所述第一推荐结果和所述第二推荐结果是由所述推荐模型输出的向量表示。
  6. 根据权利要求1所述的方法,其中基于所述第一推荐结果和所述第二推荐结果来评估所述推荐策略包括:
    确定所述第一推荐结果和所述第二推荐结果之间的第二差异;以及
    基于所述第二差异与第二阈值之间的比较,确定所述推荐策略的公平性。
  7. 根据权利要求1所述的方法,还包括:
    从所述目标应用获取第三对象特征和针对所述第三对象特征的历史推荐结果;以及
    基于所述第三对象特征和所述历史推荐结果之间的相关性,确定所述推荐策略的公平性。
  8. 根据权利要求1所述的方法,其中所述推荐策略用于向所述目标应用中的用户推荐至少一项多媒体内容。
  9. 根据权利要求1所述的方法,还包括:
    获取与所述推荐策略相对应的源代码;以及
    基于所述源代码或与所述源代码对应的中间代码,评估所述推荐策略。
  10. 一种用于管理推荐策略的装置,包括:
    获取模块,被配置为获取与目标应用中的一组对象相关联的一组对象特征,所述一组对象特征是基于所述一组对象的属性而转换得到的,所述一组对象特征不直接表达所述一组对象的所述属性;
    选择模块,被配置为从所述一组对象特征中确定第一对象特征和第二对象特征,所述第一对象特征和所述第二对象特征之间的第一差异小于第一阈值;
    确定模块,被配置为基于所述目标应用中的所述推荐策略,确定与所述第一对象特征对应的第一推荐结果和与所述第二对象特征对应的第二推荐结果;以及
    评估模块,被配置为基于所述第一推荐结果和所述第二推荐结果,评估所述推荐策略。
  11. 一种电子设备,包括:
    存储器和处理器;
    其中所述存储器用于存储一条或多条计算机指令,其中所述一条或多条计算机指令被所述处理器执行以实现根据权利要求1至9中任一项所述的方法。
  12. 一种计算机可读存储介质,其上存储有一条或多条计算机指令,其中所述一条或多条计算机指令被处理器执行以实现根据权利要求1至9中任一项所述的方法。
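  13. A computer program product, comprising one or more computer instructions, wherein the one or more computer instructions are executed by a processor to implement the method according to any one of claims 1 to 9.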
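
The evaluation flow recited in claims 1 and 6 can be illustrated with a minimal Python sketch. It is not the applicant's implementation and rests on several assumptions that the claims leave open: object features and recommendation results are plain numeric vectors, both "differences" are Euclidean distances, and a pair is flagged as potentially unfair when the second difference reaches the second threshold. The names `similar_pairs`, `evaluate_policy`, and `toy_model` are hypothetical; `toy_model` stands in for the recommendation model that claim 4 describes as reachable through an API of the target application.

```python
from itertools import combinations
from math import dist
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def similar_pairs(features: Sequence[Vector], first_threshold: float) -> List[Tuple[int, int]]:
    """Return index pairs whose feature distance is below first_threshold."""
    return [
        (i, j)
        for i, j in combinations(range(len(features)), 2)
        if dist(features[i], features[j]) < first_threshold
    ]


def evaluate_policy(
    features: Sequence[Vector],
    recommend: Callable[[Vector], Vector],
    first_threshold: float,
    second_threshold: float,
) -> List[Tuple[int, int]]:
    """Return similar feature pairs whose recommendation results diverge.

    Assumption: a second difference at or above second_threshold is read
    as a sign of unfairness; the claims only say the two are compared.
    """
    violations = []
    for i, j in similar_pairs(features, first_threshold):
        second_difference = dist(recommend(features[i]), recommend(features[j]))
        if second_difference >= second_threshold:
            violations.append((i, j))
    return violations


def toy_model(feature: Vector) -> Vector:
    """Hypothetical stand-in for the recommendation model (illustration only)."""
    return [feature[0] * 2.0, 1.0 if feature[1] > 0.5 else 0.0]


# Transformed object features; they do not directly express object attributes.
features = [[0.10, 0.49], [0.11, 0.51], [0.90, 0.20]]
violations = evaluate_policy(features, toy_model, first_threshold=0.05, second_threshold=0.5)
print(violations)
```

In this toy data only the first two feature vectors are close enough to form a pair, and the stand-in model treats them very differently, so that pair is reported. An empty list would mean that no similar pair received divergent results under the chosen thresholds.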
PCT/CN2022/123907 2021-10-27 2022-10-08 Method and apparatus for managing a recommendation policy WO2023071729A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111256520.5 2021-10-27
CN202111256520.5A CN116055074A (zh) 2021-10-27 2021-10-27 Method and apparatus for managing a recommendation policy

Publications (1)

Publication Number Publication Date
WO2023071729A1 (zh) 2023-05-04

Family

ID=85229746

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/123907 WO2023071729A1 (zh) 2021-10-27 2022-10-08 Method and apparatus for managing a recommendation policy

Country Status (3)

Country Link
US (1) US11586773B1 (zh)
CN (1) CN116055074A (zh)
WO (1) WO2023071729A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040243527A1 (en) * 2003-05-28 2004-12-02 Gross John N. Method of testing online recommender system
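CN110858327A (zh) * 2018-08-24 2020-03-03 宏达国际电子股份有限公司 Method for verifying training data, training system, and computer program product
CN111352833A (zh) * 2020-02-24 2020-06-30 北京百度网讯科技有限公司 Test method, apparatus, device, and computer storage medium for a recommender system
CN111460384A (zh) * 2020-03-31 2020-07-28 北京百度网讯科技有限公司 Policy evaluation method, apparatus, and device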

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
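CN102411754A (zh) * 2011-11-29 2012-04-11 南京大学 Personalized recommendation method based on commodity attribute entropy
JP6014515B2 (ja) * 2013-02-22 2016-10-25 株式会社エヌ・ティ・ティ・データ Recommendation information providing system, recommendation information generating device, recommendation information providing method, and program
JP6744353B2 (ja) * 2017-04-06 2020-08-19 ネイバー コーポレーション (NAVER Corporation) Personalized product recommendation using deep learning
CN109688178B (zh) * 2017-10-19 2022-03-11 阿里巴巴集团控股有限公司 Recommendation method, apparatus, and device
KR20190053675A (ko) * 2017-11-10 2019-05-20 삼성전자주식회사 Electronic device and operating method thereof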
US10565475B2 (en) * 2018-04-24 2020-02-18 Accenture Global Solutions Limited Generating a machine learning model for objects based on augmenting the objects with physical properties
US11521020B2 (en) * 2018-10-31 2022-12-06 Equifax Inc. Evaluation of modeling algorithms with continuous outputs
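CN111861605A (zh) * 2019-04-28 2020-10-30 阿里巴巴集团控股有限公司 Business object recommendation method
CN113508378A (zh) * 2019-10-31 2021-10-15 华为技术有限公司 Training method for a recommendation model, recommendation method, apparatus, and computer-readable medium
CN111242752B (zh) * 2020-04-24 2020-08-14 支付宝(杭州)信息技术有限公司 Method and system for determining recommended objects based on multi-task prediction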
US11620579B2 (en) * 2020-07-09 2023-04-04 Intuit, Inc. Generalized metric for machine learning model evaluation for unsupervised classification

Also Published As

Publication number Publication date
US11586773B1 (en) 2023-02-21
CN116055074A (zh) 2023-05-02

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22885610; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 2022885610; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2022885610; Country of ref document: EP; Effective date: 20240424)
REG Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112024008211; Country of ref document: BR)