CN111797422A - Data privacy protection query method and device, storage medium and electronic equipment - Google Patents

Data privacy protection query method and device, storage medium and electronic equipment

Info

Publication number
CN111797422A
Authority
CN
China
Prior art keywords
data
information
distributed storage
basic
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910282034.7A
Other languages
Chinese (zh)
Inventor
何明
陈仲铭
徐鑫
刘耀勇
陈岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201910282034.7A
Publication of CN111797422A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application disclose a data privacy protection query method and device, a storage medium and electronic equipment. The data privacy protection query method comprises the following steps: acquiring first data, second data and third data of basic data, and storing the first data, the second data and the third data at a terminal; acquiring basic data information of the terminal; performing distributed storage on the basic data information in the cloud to obtain a distributed storage database; when a query instruction of a user is acquired, extracting the basic data information from the distributed storage database; and matching the basic data information at the terminal to obtain target basic data. The embodiments extract and fuse the key features of the basic data using three-level storage. When the data is operated on, the first data is not transmitted to the cloud, and only the extracted second data and third data are transmitted to the cloud. User privacy data is therefore not exposed in the cloud, and the security of both system data and user privacy data is effectively protected at the terminal and in the cloud.

Description

Data privacy protection query method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of electronic technologies, and in particular, to a data privacy protection query method, apparatus, storage medium, and electronic device.
Background
With the development of electronic technology, electronic devices such as smart phones have become more and more intelligent. The electronic device may perform data processing through various algorithmic models to provide various functions to the user. For electronic devices that need to collect large amounts of data, security of system data and security of user privacy data are important.
Disclosure of Invention
The embodiments of the application provide a data privacy protection query method, a data privacy protection query device, a storage medium and electronic equipment, which can protect both the security of system data and the security of user privacy data at the terminal and in the cloud.
The embodiment of the application provides a data privacy protection query method, which is applied to electronic equipment, wherein the data privacy protection query method comprises the following steps:
clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of second data to obtain third data, and storing the first data, the second data and the third data at a terminal;
acquiring basic data information of a terminal, wherein the basic data information is basic data information of second data and third data;
performing distributed storage on the basic data information at a cloud end to obtain a distributed storage database, wherein the distributed storage database comprises a plurality of distributed storage sub-nodes;
when a query instruction of a user is acquired, extracting the basic data information from a distributed storage database according to the query instruction;
and matching the basic data information at the terminal to obtain target basic data.
An embodiment of the present application further provides a data privacy protection query apparatus, including:
the processing module is used for clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of second data to obtain third data, and storing the first data, the second data and the third data at a terminal;
the acquisition module is used for acquiring basic data information of the terminal, wherein the basic data information is basic data information of second data and third data;
the storage module is used for performing distributed storage on the basic data information at a cloud end to obtain a distributed storage database, and the distributed storage database comprises a plurality of distributed storage sub-nodes;
the extraction module is used for extracting the basic data information from the distributed storage database according to the query instruction when the query instruction of the user is acquired;
and the matching module is used for matching the basic data information at the terminal to obtain target basic data.
An embodiment of the present application further provides a storage medium, where the storage medium stores a computer program, and when the computer program runs on a computer, the computer program causes the computer to perform the following steps:
clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of second data to obtain third data, and storing the first data, the second data and the third data at a terminal;
acquiring basic data information of a terminal, wherein the basic data information is basic data information of second data and third data;
performing distributed storage on the basic data information at a cloud end to obtain a distributed storage database, wherein the distributed storage database comprises a plurality of distributed storage sub-nodes;
when a query instruction of a user is acquired, extracting the basic data information from a distributed storage database according to the query instruction;
and matching the basic data information at the terminal to obtain target basic data.
An embodiment of the present application further provides an electronic device, where the electronic device includes a processor and a memory, where the memory stores a computer program, and the processor is configured to execute the following steps by calling the computer program stored in the memory:
clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of second data to obtain third data, and storing the first data, the second data and the third data at a terminal;
acquiring basic data information of a terminal, wherein the basic data information is basic data information of second data and third data;
performing distributed storage on the basic data information at a cloud end to obtain a distributed storage database, wherein the distributed storage database comprises a plurality of distributed storage sub-nodes;
when a query instruction of a user is acquired, extracting the basic data information from a distributed storage database according to the query instruction;
and matching the basic data information at the terminal to obtain target basic data.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a schematic view of an application scenario of a data privacy protection query method according to an embodiment of the present application.
Fig. 2 is a schematic flowchart of a first data privacy protection query method according to an embodiment of the present application.
Fig. 3 is a schematic view of another application scenario of the data privacy protection query method according to the embodiment of the present application.
Fig. 4 is a schematic flowchart of a second data privacy protection query method according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of a data privacy protection query device according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of another data privacy protection query device according to an embodiment of the present application.
Fig. 7 is a schematic structural diagram of a data privacy protection query device according to an embodiment of the present application.
Fig. 8 is a schematic diagram of another structure of a data privacy protection query device according to an embodiment of the present application.
Fig. 9 is a schematic structural diagram of a first electronic device according to an embodiment of the present application.
Fig. 10 is a schematic structural diagram of a second electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without inventive step, are within the scope of the present application.
Referring to fig. 1, fig. 1 is a schematic view of an application scenario of a data privacy protection query method provided in an embodiment of the present application. The data privacy protection query method is applied to electronic equipment. A panoramic perception framework is arranged in the electronic equipment. The panoramic perception architecture is an integration of hardware and software for implementing a data privacy protection query method in an electronic device.
The panoramic perception architecture comprises an information perception layer, a data processing layer, a feature extraction layer, a scene modeling layer and an intelligent service layer.
The information perception layer is used for acquiring information of the electronic equipment and/or information in an external environment. The information-perceiving layer may comprise a plurality of sensors. For example, the information sensing layer includes a plurality of sensors such as a distance sensor, a magnetic field sensor, a light sensor, an acceleration sensor, a fingerprint sensor, a hall sensor, a position sensor, a gyroscope, an inertial sensor, an attitude sensor, a barometer, and a heart rate sensor.
Among these, the distance sensor may be used to detect the distance between the electronic device and an external object. The magnetic field sensor may be used to detect magnetic field information of the environment in which the electronic device is located. The light sensor may be used to detect light information of the environment in which the electronic device is located. The acceleration sensor may be used to detect acceleration data of the electronic device. The fingerprint sensor may be used to collect fingerprint information of a user. The Hall sensor is a magnetic field sensor based on the Hall effect and can be used to realize automatic control of the electronic device. The position sensor may be used to detect the geographic location where the electronic device is currently located. The gyroscope may be used to detect the angular velocity of the electronic device in various directions. The inertial sensor may be used to detect motion data of the electronic device. The attitude sensor may be used to sense attitude information of the electronic device. The barometer may be used to detect the barometric pressure of the environment in which the electronic device is located. The heart rate sensor may be used to detect heart rate information of the user.
The data processing layer is used for processing the data acquired by the information perception layer. For example, the data processing layer may perform data cleaning, data integration, data transformation, data reduction, and the like on the data acquired by the information perception layer.
The data cleaning refers to cleaning a large amount of data acquired by the information sensing layer to remove invalid data and repeated data. The data integration refers to integrating a plurality of single-dimensional data acquired by the information perception layer into a higher or more abstract dimension so as to comprehensively process the data of the plurality of single dimensions. The data transformation refers to performing data type conversion or format conversion on the data acquired by the information sensing layer so that the transformed data can meet the processing requirement. The data reduction means that the data volume is reduced to the maximum extent on the premise of keeping the original appearance of the data as much as possible.
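The cleaning, transformation and reduction operations described above can be sketched as follows. This is a minimal illustration only: the function names, the (timestamp, value) record layout and the choice of a windowed mean as the reduction are assumptions made for the example and are not specified in the application. Data integration is omitted for brevity.

```python
from statistics import mean


def clean(readings):
    """Data cleaning: drop invalid (None) values and exact duplicate records."""
    seen = set()
    result = []
    for ts, value in readings:
        if value is None or (ts, value) in seen:
            continue
        seen.add((ts, value))
        result.append((ts, value))
    return result


def transform(readings):
    """Data transformation: convert every value to float (a type conversion)."""
    return [(ts, float(value)) for ts, value in readings]


def reduce_by_window(readings, window_s):
    """Data reduction: replace all values in a time window with their mean."""
    buckets = {}
    for ts, value in readings:
        buckets.setdefault(ts // window_s, []).append(value)
    return {w * window_s: mean(vals) for w, vals in sorted(buckets.items())}


# Hypothetical raw sensor readings: (timestamp in seconds, value as text).
raw = [(0, "1"), (0, "1"), (1, None), (2, "3"), (10, "5"), (11, "7")]
processed = reduce_by_window(transform(clean(raw)), window_s=10)
```

In this sketch the duplicate and the missing reading are removed first, so the reduction step only ever sees valid numeric values.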
The feature extraction layer is used for extracting features from the data processed by the data processing layer. The extracted features may reflect the state of the electronic device itself, the state of the user, the state of the environment in which the electronic device is located, and the like.
The feature extraction layer may extract features, or process the extracted features, using a filter method, a wrapper method, an ensemble method, or the like.
The filter method filters the extracted features to remove redundant feature data. The wrapper method screens the extracted features. The ensemble method combines multiple feature extraction methods to construct a more efficient and more accurate feature extraction method.
The scene modeling layer is used for building a model according to the features extracted by the feature extraction layer, and the obtained model can be used for representing the state of the electronic equipment, the state of a user, the environment state and the like. For example, the scenario modeling layer may construct a key value model, a pattern identification model, a graph model, an entity relation model, an object-oriented model, and the like according to the features extracted by the feature extraction layer.
The intelligent service layer is used for providing intelligent services for the user according to the model constructed by the scene modeling layer. For example, the intelligent service layer can provide basic application services for users, perform system intelligent optimization for electronic equipment, and provide personalized intelligent services for users.
In addition, the panoramic perception architecture can further comprise a plurality of algorithms, each of which can be used for analyzing and processing data, and the plurality of algorithms can form an algorithm library. For example, the algorithm library may include algorithms such as a Markov algorithm, latent Dirichlet allocation, a Bayesian classification algorithm, a support vector machine, a K-means clustering algorithm, a K-nearest neighbor algorithm, a conditional random field, a residual network, a long short-term memory network, a convolutional neural network, and a recurrent neural network.
The embodiments of the application provide a data privacy protection query method, which can be applied to electronic equipment. The electronic device may be a smartphone, a tablet, a gaming device, an AR (Augmented Reality) device, an automobile, a data storage device, an audio playback device, a video playback device, a notebook, a desktop computing device, or a wearable device such as a watch, glasses, a helmet, an electronic bracelet, an electronic necklace, or an electronic garment.
The data privacy protection query method provided by the embodiments of the application can be used within the panoramic perception architecture, and involves the information perception layer, the data processing layer and the feature extraction layer of that architecture.
Referring to fig. 2, fig. 2 is a schematic flowchart of a first data privacy protection query method according to an embodiment of the present application. The data privacy protection query method comprises the following steps:
110. Clustering the basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing the plurality of second data to obtain third data, and storing the first data, the second data and the third data at the terminal.
In this step, the first data may be clustered basic data; the second data may be panoramic information features; the third data may be fused feature data.
The basic data may include operation information of the electronic device, configuration information of the electronic device, user information, current environment information, and the like. Specifically, the basic data may be collected by one or more sensors, or may be collected in real time. For example, the current environmental information and the related information of the electronic device are acquired by at least one of a distance sensor, a magnetic field sensor, a light sensor, an acceleration sensor, a fingerprint sensor, a hall sensor, a position sensor, a gyroscope, an inertial sensor, an attitude sensor, a barometer, a blood pressure sensor, a pulse sensor, a heart rate sensor, and the like. The current environment information includes body information of the user, such as blood pressure, pulse, heart rate, and the like. The related information of the electronic device includes operation information of the electronic device, configuration information of the electronic device, user information stored in the electronic device, and the like. The user information comprises information of man-machine interaction such as identity information, personal hobbies, browsing records and personal collections of the user. The operation information of the electronic device includes startup time, shutdown time, standby time, memory usage at each time point, main chip usage at each time point, current operation program information, background operation program information, operation duration of each program, download amount of each program, and the like. In some embodiments, the base data may also include behavioral data of the user operating the terminal, sensor data, and terminal system operational data.
After the plurality of basic data are obtained, the basic data can be stored in the first storage module. For example, multiple pieces of panoramic perception information may be stored on a hard disk. A plurality of databases can be provided, and the basic data is stored in the corresponding database according to its category.
The basic data is clustered to obtain the first data. The first data may be the clustered basic data. Basic data of the same type is aggregated to form a data set, yielding a plurality of data sets for multiple types of basic data. The basic data may be classified according to hardware attributes of the data, such as data related to the main chip, data related to the display screen, data related to the hard disk, data related to the memory, data related to the various sensors, and the like. The basic data may also be classified according to the corresponding application, such as data related to system applications and data related to installed applications; the data related to installed applications can be further classified by specific application, such as data related to an instant messaging application, data related to a map application, data related to a shopping application, and the like. Storing the basic data in the corresponding databases by category effectively isolates unrelated data, so that each kind of data can be stored independently. In some embodiments, a time-series index can also be obtained for each corresponding database to facilitate indexing of the basic data.
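As a rough illustration of this grouping step, the sketch below sorts basic data items into per-category stores. The record fields (`ts`, `categories`, `value`) and the category names are hypothetical. An item tagged with more than one category is copied into each matching store, so that every store stays independent.

```python
from collections import defaultdict


def cluster_basic_data(items):
    """Group basic data items into per-category stores (the first data)."""
    databases = defaultdict(list)
    for item in items:
        for category in item["categories"]:
            # Store a copy so that each category store is independent.
            databases[category].append(dict(item))
    for records in databases.values():
        # Keep each store in time order to make later indexing easier.
        records.sort(key=lambda record: record["ts"])
    return dict(databases)


# Hypothetical basic data items.
items = [
    {"ts": 3, "categories": ["acceleration_sensor"], "value": 0.2},
    {"ts": 1, "categories": ["acceleration_sensor"], "value": 0.1},
    {"ts": 2, "categories": ["map_app", "position_sensor"], "value": "lat/lon"},
]
first_data = cluster_basic_data(items)
```

Here the grouping key is a label carried by each item. A learned clustering algorithm such as K-means could take its place where no label is available, but that is a different technique from the one shown.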
The data format and data content of each type of first data can differ. For example, the Wi-Fi connection information in the sensor data is very limited, and no Wi-Fi information is recorded while no Wi-Fi signal is connected; in contrast, IMU data is returned continuously at a rate measured in hertz and can amount to gigabytes of data per day. Performing feature extraction on the basic data in the databases reduces redundant information and saves storage space on the one hand, and effectively extracts the important meaning of the basic data on the other. For example, audio information is time-series information whose volume grows continuously over time, so feature extraction is needed to reduce the data volume. Taking audio information with a bit width of 32 bits and a sampling frequency of 44100 as an example, the data generated in 5 minutes is about 1G; after feature extraction, the important features of each time window are obtained and can be stored in vector form, so that the 1G of data can be compressed to several hundred k.
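The size of the raw audio in this example can be checked with a short calculation. The sketch assumes uncompressed PCM audio and treats the channel count as a free parameter, since the text does not state one; the function name is hypothetical.

```python
def pcm_size_bytes(sample_rate_hz, bit_width, channels, seconds):
    """Raw PCM size = sample rate x bytes per sample x channels x duration."""
    return sample_rate_hz * (bit_width // 8) * channels * seconds


five_minutes = 5 * 60
mono = pcm_size_bytes(44100, 32, 1, five_minutes)
stereo = pcm_size_bytes(44100, 32, 2, five_minutes)
stereo_gigabits = stereo * 8 / 1e9
```

Under these assumptions, five minutes of audio comes to 52,920,000 bytes for one channel and 105,840,000 bytes (roughly 0.85 gigabits) for two channels, so the "about 1G" figure is closest to two-channel audio measured in bits. Either way, a per-window feature vector is several orders of magnitude smaller than the raw samples.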
Basic data of the same type is stored in the same database. A basic data item can be stored in a single database; for example, an acceleration sensor data item is stored only in the acceleration sensor database. When a basic data item belongs to two categories, the item may be copied, and the copy and the original may be stored in the two databases corresponding to the two categories to which the item belongs. It should be noted that a database may store not only the currently acquired panoramic perception information but also previously stored panoramic perception information.
Feature extraction is performed on the first data to obtain the second data, which can be panoramic information features. Features are extracted from the basic data of the databases to obtain the panoramic information features corresponding to each database, and the panoramic information features are stored as a second level of storage. The second data does not need to contain a large amount of original basic data; it only needs to contain the corresponding panoramic information features. Feature extraction effectively captures the important features of the first data, reduces the redundant information of the original basic data, and saves storage space. The volume of the second data is greatly reduced relative to the original basic data and the first data. It should be noted that because feature extraction is performed on the databases and only the extracted second data is stored, the data is not stored directly in its original format, so information security is strictly controlled and the privacy of the user is protected. Extracting features from the basic data of the databases desensitizes the source data, effectively records user data that has been desensitized at the feature level, reduces data redundancy, and facilitates subsequent use.
Feature extraction is performed separately on the data in each database to obtain the panoramic information features corresponding to that database. A feature extraction layer can be provided to extract features of the basic data in a plurality of ways, and different feature extraction methods can be provided for different data.
In some embodiments, feature extraction is performed on the basic data of the databases using manually preset rules, in which the important features of each category of basic data are specified in advance. The basic data is clustered and stored in the corresponding databases, the shared important features of the basic data in the same database are identified, the specific values of each basic data item corresponding to the preset important features are extracted, these values are taken as the panoramic information features, and the panoramic information features are stored as the second level of storage.
In some embodiments, feature extraction is performed on the basic data of the databases using a machine learning model trained in advance. The step of extracting features from the basic data of the databases to obtain the panoramic information features corresponding to each database may specifically be: pre-training a machine learning model to obtain a machine learning model matched with the basic data; and inputting the basic data into the machine learning model, acquiring the model output, and taking the model output as the panoramic information features.
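The manually preset approach can be sketched as a lookup table from each data category to the features that are kept for it. The table contents, the feature names and the sample values below are hypothetical and serve only to illustrate the idea of storing a few preset features instead of the raw values.

```python
from statistics import mean

# Hypothetical preset table: category -> names of the important features.
PRESET_FEATURES = {
    "acceleration_sensor": ("mean", "peak"),
    "memory_usage": ("mean",),
}

# How each named feature is computed from the raw values of one window.
FEATURE_FUNCTIONS = {
    "mean": mean,
    "peak": max,
}


def extract_preset_features(category, values):
    """Return the preset features for one category as a feature vector."""
    return [FEATURE_FUNCTIONS[name](values) for name in PRESET_FEATURES[category]]


second_data = {
    "acceleration_sensor": extract_preset_features(
        "acceleration_sensor", [0.5, 1.5, 1.0]
    ),
    "memory_usage": extract_preset_features("memory_usage", [40.0, 60.0]),
}
```

Only the resulting feature vectors would be kept at the second storage level in this sketch; the raw values never appear in `second_data`.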
The plurality of second data is fused to obtain the third data. The third data may be fused feature data. Specifically, the panoramic information features may be fused by multi-table joins, by time-series alignment, or by a combination of multi-table joins and time-series alignment. Most data on the terminal is time-series data: the operations of the user and the scene of the terminal differ at different time points and change over time. Fusing the second data therefore further reduces the asymmetry among the data and compresses the data volume.
The panoramic information features are fused to obtain fused feature data, which is stored as a third level of storage in a third storage unit. Cascaded storage provides effective disaster-recovery backup of the data and avoids storing and transmitting plaintext data. A specific feature extraction step extracts high-dimensional features from the basic data (which is comparable to performing an encryption operation on the basic data), effectively protecting the private information of the user.
The first data, the second data and the third data are stored at the terminal. The data may be returned using a triggered backhaul method, that is, data is reported back when a trigger occurs. For example, for a network module, when the Wi-Fi function is turned on, nearby available networks are searched and the data detected by the network module is passed to the system; when collecting basic data, the system listens for and collects system notification messages.
In some embodiments, the time-series index corresponding to each database may also be obtained and stored in the second storage module (e.g., a memory), so that other modules of the system can find the corresponding basic data in the database according to the time-series index. Time-series clustering of the multi-source heterogeneous basic data effectively compresses the original basic data, reduces its redundant information, and enables real-time indexing and access. Because the electronic equipment has limited computing and storage resources, accessing and allocating the basic data sensibly can speed up retrieval of the panoramic perception information.
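A combination of time-series alignment and a multi-table join can be sketched as follows. The sketch assumes each feature table is a list of (timestamp, value) pairs, aligns timestamps to fixed windows, and keeps only windows present in every table (an inner join). The table names and the window length are hypothetical.

```python
def fuse_features(tables, window_s):
    """Align feature tables to shared time windows and join them.

    tables maps a table name to a list of (timestamp_s, feature_value).
    Returns one row per window that appears in every table.
    """
    aligned = {}
    for name, rows in tables.items():
        for ts, value in rows:
            window = (ts // window_s) * window_s  # time-series alignment
            # If a table has several values in one window, the last one wins.
            aligned.setdefault(window, {})[name] = value
    return [
        {"window": window, **features}
        for window, features in sorted(aligned.items())
        if len(features) == len(tables)  # inner join across all tables
    ]


# Hypothetical second data from two databases.
feature_tables = {
    "acceleration": [(0, 0.9), (61, 1.4), (125, 0.2)],
    "screen_on": [(5, 1), (70, 0)],
}
third_data = fuse_features(feature_tables, window_s=60)
```

The third acceleration sample has no matching `screen_on` entry in its window, so it is dropped by the join; an outer join that keeps such rows with missing fields would be an equally valid design choice.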
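One simple way to realize a time-series index is a sorted list of timestamps searched by bisection, as sketched below. The class name and record layout are assumptions for illustration; the application does not specify how the index is built.

```python
from bisect import bisect_left, bisect_right


class TimeSeriesIndex:
    """Sorted-timestamp index over (timestamp, payload) records."""

    def __init__(self, records):
        self._records = sorted(records, key=lambda record: record[0])
        self._timestamps = [ts for ts, _ in self._records]

    def query(self, start, end):
        """Return payloads whose timestamp satisfies start <= ts <= end."""
        lo = bisect_left(self._timestamps, start)
        hi = bisect_right(self._timestamps, end)
        return [payload for _, payload in self._records[lo:hi]]


index = TimeSeriesIndex([(30, "c"), (10, "a"), (20, "b"), (40, "d")])
hits = index.query(15, 30)
```

Each range query costs two binary searches plus the size of the result, which suits a device with limited computing resources.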
120. Acquiring basic data information of the terminal, wherein the basic data information is basic data information of the second data and the third data.
The second data and the third data of the terminal are extracted and then transmitted to the cloud. The intelligent mobile terminal device acts as a client and uploads the second data and the third data to a cloud server cluster, packaging the data with the TCP/IP protocol so that it is uploaded to the server cluster as streaming data. Through the server cluster in the cloud, data at all levels can be indexed.
The basic data information of the terminal is acquired in the cloud, wherein the basic data information is basic data information of the second data and the third data. The basic data information may be the second data and the third data themselves, or may be an index table of the second data and the third data. In other words, the basic data information in the cloud corresponds to the second data and the third data of the terminal, and the second data and the third data of the terminal can be obtained according to the basic data information.
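The selection of what leaves the terminal can be sketched as below. The store layout and the function name are hypothetical, and the index-table variant is reduced to a list of keys because the application does not describe the format of the index table. Network transport is not shown.

```python
def build_basic_data_information(terminal_store, as_index=False):
    """Select what is uploaded: second and third data only, never first data."""
    uploadable = {
        level: terminal_store[level] for level in ("second_data", "third_data")
    }
    if not as_index:
        # Variant 1: upload the second and third data themselves.
        return uploadable
    # Variant 2: upload only an index table (here, the sorted entry keys).
    return {level: sorted(entries) for level, entries in uploadable.items()}


# Hypothetical three-level store held at the terminal.
terminal_store = {
    "first_data": {"acceleration": [0.5, 1.5, 1.0]},
    "second_data": {"acceleration": [1.0, 1.5]},
    "third_data": {"window_0": {"acceleration": 1.0}},
}
payload = build_basic_data_information(terminal_store)
index_table = build_basic_data_information(terminal_store, as_index=True)
```

In both variants the clustered raw data under `first_data` stays on the terminal, which matches the privacy goal stated above.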
130. Perform distributed storage on the basic data information at the cloud to obtain a distributed storage database, where the distributed storage database includes a plurality of distributed storage sub-nodes.
Distributed storage is a data storage technology that uses, over a network, the disk space on each machine in an enterprise and combines these scattered storage resources into one virtual storage device, so that data is stored in a distributed manner across the enterprise. Distributed storage shares the storage load, which improves the reliability, availability and access efficiency of the system and also makes the system easy to expand.
Distributed storage techniques may include storing a file multiple times on multiple machines, which spreads the burden and risk of data storage. The more copies of a file exist, the less likely the file is to be lost. However, more copies also mean more places from which the file can be stolen, so an encryption system is required for sensitive data or environments. In some embodiments, the data stored in a distributed manner at the cloud is the basic data information. The basic data information includes neither the original basic data nor the first data obtained by clustering the original basic data; it may include the second data obtained by performing feature extraction on the first data, and may include the third data obtained by fusing the second data. The basic data information is stored in a distributed manner at the cloud to obtain a distributed storage database. Because the cloud does not store or process the original basic data, this distributed storage method can provide a high level of security while maintaining performance.
Traditional distributed storage is essentially a centralized system: data is scattered across a plurality of independent devices, an expandable system structure is adopted, a plurality of storage servers share the storage load, and a location server is used to locate the stored information. Network-based distributed storage is a core technology of the blockchain; it is a distributed database built by storing data on blocks and using the storage space of open nodes, which addresses the problems of traditional distributed storage.
The nodes are network nodes in a blockchain distributed system, that is, servers, computers, phones and the like connected through a network, and the way of becoming a node differs for blockchains of different natures. Taking Bitcoin as an example, a participant becomes a node by taking part in transactions or by mining.
The distributed storage database includes a plurality of distributed storage sub-nodes. At the cloud, the distributed storage sub-nodes may be divided into a plurality of clusters according to their corresponding data categories, for example, into a user cluster, a secondary feature cluster and a tertiary feature cluster. The user cluster contains a large amount of user information, and the distributed storage sub-nodes in the user cluster can correspond to the user information of each user, such as the unique identifier of a terminal and the physical addresses at which the data of different terminal users is stored in the server. The secondary feature cluster and the tertiary feature cluster can be retrieved and queried through the user cluster.
The secondary feature cluster and the tertiary feature cluster are basic data information clusters corresponding to the second data and the third data. The basic data information may be the second data and the third data themselves, or may be information that can match the second data and the third data, for example, an index table corresponding to the second data and the third data. The distributed storage sub-nodes in the secondary feature cluster and the tertiary feature cluster can correspond to the basic data information of each user, so that the basic data information can be extracted according to the distributed storage sub-nodes.
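The relationship between the three clusters can be sketched as follows. This is a single-process illustration under stated assumptions: the class `CloudStore`, its methods and the node identifiers are hypothetical, and plain dictionaries stand in for distributed storage sub-nodes.

```python
from dataclasses import dataclass, field


@dataclass
class CloudStore:
    # User cluster: terminal id -> node ids holding that user's level 2 and level 3 information.
    user_cluster: dict = field(default_factory=dict)
    # Feature clusters: node id -> basic data information (a feature or an index entry).
    secondary_cluster: dict = field(default_factory=dict)
    tertiary_cluster: dict = field(default_factory=dict)

    def _cluster(self, level):
        if level == 2:
            return self.secondary_cluster
        if level == 3:
            return self.tertiary_cluster
        raise ValueError("only level 2 and level 3 information is stored at the cloud")

    def store(self, terminal_id, level, node_id, info):
        self._cluster(level)[node_id] = info
        entry = self.user_cluster.setdefault(terminal_id, {2: [], 3: []})
        entry[level].append(node_id)

    def fetch(self, terminal_id, level):
        # Go through the user cluster first, then read the feature cluster.
        cluster = self._cluster(level)
        node_ids = self.user_cluster.get(terminal_id, {}).get(level, [])
        return [cluster[node_id] for node_id in node_ids]


cloud = CloudStore()
cloud.store("t1", 2, "node-a", {"feature": "walk"})
cloud.store("t1", 3, "node-b", {"feature": "commute"})
cloud.store("t2", 2, "node-c", {"feature": "run"})
```

Every read passes through the user cluster, which matches the statement that the feature clusters are retrieved and queried through the user cluster.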
140. When a query instruction of the user is obtained, extract the basic data information from the distributed storage database according to the query instruction.
Obtaining the query instruction of the user includes obtaining the query instruction directly at the cloud, and also includes obtaining the query instruction at the terminal and then transmitting it to the cloud. The query instruction carries the unique identifier of the user and the data to be queried. When the query instruction of the user is obtained, the distributed storage database corresponding to the user is determined at the cloud according to the unique identifier of the user carried in the query instruction, where the distributed storage database includes the distributed storage sub-nodes holding the basic data information of the user.
In some embodiments, extracting the basic data information from the distributed storage database according to the query instruction includes: obtaining the query instruction of the user, where the query instruction carries the unique identifier of the user and the data to be queried; determining a target distributed storage database corresponding to the user according to the unique identifier; and determining the distributed storage sub-nodes corresponding to the data to be queried in the target distributed storage database, and extracting the basic data information according to the distributed storage sub-nodes.
It should be noted that the data to be queried may be the first data, the second data, the third data, user information data, and the like, and different query methods can be provided for different data to be queried. For example, when the data to be queried is the first data, feature extraction is performed on the first data to be queried to obtain the corresponding second data to be queried; the distributed storage sub-nodes corresponding to the second data to be queried are determined in the target distributed storage database; and the basic data information is extracted according to the distributed storage sub-nodes. For another example, when the data to be queried is the second data or the third data, the distributed storage sub-nodes corresponding to the data to be queried are determined in the target distributed storage database, and the basic data information is extracted according to the distributed storage sub-nodes. For another example, when the data to be queried is user information data or the like, the distributed storage sub-nodes corresponding to the user information data to be queried are determined in the distributed storage database, and the user information data is extracted according to the distributed storage sub-nodes.
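The per-type query methods can be sketched as a small dispatch function. The helper `extract_features`, the key derivation in `feature_key` and the contents of `database` are hypothetical stand-ins; the application does not specify how features are computed or how sub-nodes are keyed.

```python
def extract_features(first_data):
    """Hypothetical stand-in for feature extraction (first data -> second data)."""
    return {"mean": sum(first_data) / len(first_data)}


def feature_key(second_data):
    # Assumed key derivation: the feature value rounded to one decimal place.
    return ("second", round(second_data["mean"], 1))


def resolve_query(database, kind, value):
    """Return the basic data information for one item to be queried.

    `database` maps keys to sub-node contents for a single user.
    """
    if kind == "first":
        # First data never reaches the cloud, so query by its extracted features.
        key = feature_key(extract_features(value))
    elif kind in ("second", "third", "user_info"):
        key = (kind, value)
    else:
        raise ValueError(f"unknown data kind: {kind}")
    return database.get(key)


database = {
    ("second", 2.0): "index-entry-17",
    ("third", "commute"): "index-entry-42",
    ("user_info", "t1"): "user-record-1",
}
first_hit = resolve_query(database, "first", [1.0, 2.0, 3.0])
third_hit = resolve_query(database, "third", "commute")
```

A query for first data is converted into a query for the corresponding second data before any cloud lookup happens, so the raw values stay on the terminal.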
Those skilled in the art should appreciate that there may be various methods for extracting the basic data information in the distributed storage database according to the query instruction, and the above is merely an exemplary illustration and should not be construed as a limitation of the present application.
The extracted basic data information corresponding to the query instruction of the user may include the data to be queried itself, or may include related information that can match the data to be queried. For example, when the data to be queried is the first data, the basic data information includes related information that can match the first data to be queried; for instance, it may include the corresponding second data, or related information that can match the corresponding second data. When the data to be queried is the second data, the basic data information may include the corresponding second data, or related information that can match the corresponding second data.
It should be noted that the data to be queried may be one kind of data, or may be multiple kinds of data, and for example, may include user information data to be queried and second data. The data to be queried may be data of a certain level, or data of multiple levels, for example, the second data and the third data may be queried simultaneously, or the first data and the second data may be queried simultaneously, and so on. In some embodiments, a query instruction of a user is obtained, the query instruction carries a unique identifier of the user and a plurality of data to be queried, after a target distributed storage database corresponding to the user is determined according to the unique identifier, the types of the plurality of data to be queried are determined, and corresponding query methods are respectively executed corresponding to the types of the plurality of data to be queried. And if the plurality of data to be queried belong to the same type, determining the query sequence of the plurality of data to be queried, and querying the data to be queried according to the sequence.
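Handling a query instruction that carries several items can be sketched as follows. The grouping rule (items of the same type are queried in arrival order) is an assumption, since the application only says that a query order is determined; `plan_queries`, `run_plan` and the handlers are hypothetical names.

```python
def plan_queries(items):
    """Group (kind, value) items by kind; items of the same kind keep arrival order."""
    plan = {}
    for kind, value in items:
        plan.setdefault(kind, []).append(value)
    return plan


def run_plan(plan, handlers):
    """Run the query method registered for each kind, in the planned order."""
    results = []
    for kind, values in plan.items():
        handler = handlers[kind]
        for value in values:
            results.append((kind, value, handler(value)))
    return results


items = [("second", "s1"), ("third", "t1"), ("second", "s2")]
plan = plan_queries(items)
handlers = {
    "second": lambda value: f"second-info:{value}",
    "third": lambda value: f"third-info:{value}",
}
results = run_plan(plan, handlers)
```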
150. Match the basic data information at the terminal to obtain target basic data.
If step 140 includes feedback from the cloud on the query instruction, step 150 includes an exchange of information between the cloud and the terminal. After the basic data information is extracted at the cloud according to the data to be queried, it is matched with the data stored in the terminal to obtain the data required by the user.
When the data to be queried is the second data, the basic data information is matched with the second data of the terminal to obtain target second data; when the data to be queried is the third data, the basic data information is matched with the third data of the terminal to obtain target third data. When the data to be queried is the first data, the basic data information can be matched with the second data of the terminal to obtain target second data, and the target first data is then looked up in the terminal according to the target second data. In some embodiments, a matching mapping, a lookup table or the like between the basic data information and the first data of the terminal is preset at the cloud; in this case, the basic data information may also be matched directly with the first data of the terminal to obtain the target first data.
It should be noted that, in order to identify different terminals in the cloud, the cloud itself may store user information data or related data corresponding to the user information data, for example, a unique identifier of a user. In order to distinguish different user terminals, user information data can be extracted when basic data information of the terminal is obtained, and when a user inquires data to be inquired, target data obtained by inquiry and the user information data of the terminal are fed back to the user.
Referring to fig. 3, fig. 3 is a diagram of another application scenario of the data privacy protection query method according to the embodiment of the present application. The user behavior data, the sensor data, …, the system operation data, etc. are sources of basic data, and specifically, the basic data can be obtained through sensors, etc. Then, clustering the plurality of basic data to obtain first data, and storing the first data. The first data includes behavioral data of the user, sensor data, …, system operational data, and other basic data.
And then, the feature extraction module performs feature extraction on the first data, extracts important features of the first data as second data and stores the second data. The second data comprises behavior characteristics, sensor characteristics, …, system characteristics and other panoramic information characteristics of the user.
The third data may be fusion panorama features obtained by fusing panorama information features of the second data, and the fusion panorama features are stored.
It should be noted that the first data, the second data, and the third data are all stored in the terminal, so as to ensure that the original data is not lost.
After the third data is obtained, the second data and the third data can be uploaded to a cloud and provided to a server for data analysis, and the second data and the third data can also be transmitted to an application service layer or a data processing layer for calculation. In addition, the second data and the third data can be subjected to redundant backup, so that the data redundancy is increased, and the data loss is effectively prevented.
In some embodiments, the second data and the third data are extracted and transmitted to the cloud, where a corresponding secondary feature master and a corresponding tertiary feature master are provided. A user master is also provided at the cloud; it records a large amount of user information so that the user information can be indexed later, and the secondary feature master and the tertiary feature master can be retrieved and queried through the user master. A master may be a server or a server cluster and is mainly responsible for maintaining the physical address table and the logical address table of the storage server cluster and the processing servers, so that input data can be distributed to the servers for storage and processing. Each master maintains an index table but does not store the actual data; the advantage of this design is that the information is stored and maintained in a distributed manner. Distributed storage supports high-performance reading and writing, keeps multiple copies consistent, facilitates disaster recovery and backup of data, allows server nodes to be scaled elastically, and standardizes the storage system.
When the panoramic perception modeling layer of the server needs to use the secondary feature and the tertiary feature, the request is sent to the user master, and the secondary feature data or the tertiary feature data corresponding to the user can be obtained from a child node (node) of the distributed database.
In some embodiments, the first data, the second data, the third data and the like may be indexed in a CG-Index manner. The basic idea is that each node maintains a local B+ tree and also maintains a global CG-Index table; by accessing the CG-Index table, it can be determined on which nodes the local index needs to be queried, and high-performance random reading and writing can be supported. CG-Index treats the index as another form of data and stores it in a table, and internally uses a complementary check table in place of an ordinary table to implement fault tolerance and recovery of the index, thereby reducing the storage overhead of the index.
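The two-level lookup idea (a global table that selects nodes, then a local ordered index on each node) can be illustrated with a simplified sketch. It is not an implementation of CG-Index: a sorted list stands in for the local B+ tree, the global table only records each node's key range, and all class and method names are hypothetical.

```python
from bisect import bisect_left, insort


class Node:
    def __init__(self, name):
        self.name = name
        self.keys = []    # sorted keys, standing in for a local B+ tree
        self.values = {}

    def put(self, key, value):
        if key not in self.values:
            insort(self.keys, key)
        self.values[key] = value

    def range_query(self, low, high):
        start = bisect_left(self.keys, low)
        out = []
        for key in self.keys[start:]:
            if key > high:
                break
            out.append((key, self.values[key]))
        return out


class GlobalIndex:
    def __init__(self, nodes):
        self.nodes = nodes

    def table(self):
        # Global table: (smallest key, largest key, node) for every non-empty node.
        return [(n.keys[0], n.keys[-1], n) for n in self.nodes if n.keys]

    def nodes_for(self, low, high):
        # Decide which nodes need a local lookup: those whose key range overlaps.
        return [n for lo, hi, n in self.table() if lo <= high and low <= hi]

    def range_query(self, low, high):
        hits = []
        for node in self.nodes_for(low, high):
            hits.extend(node.range_query(low, high))
        return sorted(hits)


a, b = Node("a"), Node("b")
a.put(1, "x")
a.put(5, "y")
b.put(10, "z")
b.put(20, "w")
global_index = GlobalIndex([a, b])
hits = global_index.range_query(4, 12)
```

Only nodes whose key range overlaps the query are consulted, which is the property the global table is meant to provide.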
Referring to fig. 4, fig. 4 is a schematic flowchart of a second data privacy protection query method according to an embodiment of the present application. The data privacy protection query method comprises the following steps:
210. Cluster the basic data to obtain first data, perform feature extraction on the first data to obtain second data, fuse a plurality of second data to obtain third data, and store the first data, the second data and the third data at the terminal.
The basic data includes, but is not limited to, user behavior data, sensor data, system operation data, and the like, and the first data, the second data, and the third data of the basic data are stored in the terminal.
A first-level storage module: the first data is stored as the data of the primary storage database. It should be noted that, because this database involves core user data and private user data, it is only backed up locally in the terminal and is not uploaded to the server.
A secondary storage module: the secondary storage extracts high-dimensional features from each independent database of the first data, which effectively reduces data redundancy and desensitizes the source data.
A third-level storage module: this module mainly fuses the second data of the independent features. Specifically, it performs a table join operation on the secondary storage databases, merges the data dimensions that share the same time sequence, and fuses the panoramic features.
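The three storage levels can be sketched end to end on toy data. Several simplifications are assumed and are not taken from the application: fixed time windows stand in for time series clustering, the per-window mean stands in for feature extraction, and the fusion step is a join on the shared window start.

```python
from collections import defaultdict
from statistics import mean


def cluster_by_window(samples, window):
    """Level 1: bucket raw (timestamp, value) samples into fixed time windows."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp // window * window].append(value)
    return dict(buckets)


def extract_feature(first_data):
    """Level 2: one summary feature (the mean) per window."""
    return {start: mean(values) for start, values in first_data.items()}


def fuse(named_features):
    """Level 3: join the per-source feature tables on their shared window start."""
    shared = set.intersection(*(set(table) for table in named_features.values()))
    return {
        start: {name: table[start] for name, table in named_features.items()}
        for start in sorted(shared)
    }


accel_first = cluster_by_window([(0, 1.0), (5, 3.0), (12, 4.0)], window=10)
light_first = cluster_by_window([(3, 100.0), (14, 300.0), (25, 50.0)], window=10)
accel_second = extract_feature(accel_first)
light_second = extract_feature(light_first)
third = fuse({"accel": accel_second, "light": light_second})
```

Only `accel_second`, `light_second` and `third` would be candidates for upload; `accel_first` and `light_first` stay on the terminal.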
220. Set unique identifiers for different terminals to distinguish the terminals from one another.
The secondary storage information and the tertiary storage information of the terminal are transmitted to the cloud and marked with the device information of the user, so as to distinguish the user from other users. Different users store different data in the cloud; by setting a unique identifier for each terminal, different terminals can be distinguished, and the part corresponding to the user can be found accurately when data is extracted and matched.
230. Acquire basic data information of the terminal, where the basic data information is basic data information of the second data and the third data.
The basic data information of the terminal is acquired at the cloud, where the basic data information is basic data information of the second data and the third data. The basic data information may be the second data and the third data themselves, or may be an index table of the second data and the third data. In other words, the basic data information at the cloud corresponds to the second data and the third data of the terminal, and the second data and the third data of the terminal can be obtained according to the basic data information.
240. Perform distributed storage on the basic data information at the cloud to obtain a distributed storage database, where the distributed storage database includes a plurality of distributed storage sub-nodes.
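The application does not say how the unique identifier is generated. One possible approach, shown purely as an assumption, derives a stable identifier by hashing the device information with a salt, so that uploads can be tagged without sending the device information itself; the function and field names are hypothetical.

```python
import hashlib


def terminal_identifier(device_info, salt):
    """Derive a stable pseudonymous identifier from device information (assumed scheme)."""
    digest = hashlib.sha256((salt + "|" + device_info).encode("utf-8")).hexdigest()
    return digest[:16]


def tag_upload(device_info, salt, level, payload):
    """Attach the terminal identifier to a level 2 or level 3 record before upload."""
    return {
        "terminal_id": terminal_identifier(device_info, salt),
        "level": level,
        "payload": payload,
    }


id_a = terminal_identifier("device-A", "example-salt")
id_b = terminal_identifier("device-B", "example-salt")
tagged = tag_upload("device-A", "example-salt", 2, {"step_rate": 1.8})
```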
The distributed storage database includes a plurality of distributed storage sub-nodes. At the cloud, the distributed storage sub-nodes may be divided into a plurality of clusters according to their corresponding data categories, for example, into a user cluster, a secondary feature cluster and a tertiary feature cluster. The user cluster contains a large amount of user information, and the distributed storage sub-nodes in the user cluster can correspond to the user information of each user, such as the unique identifier of a terminal and the physical addresses at which the data of different terminal users is stored in the server. The secondary feature cluster and the tertiary feature cluster can be retrieved and queried through the user cluster.
The secondary feature cluster and the tertiary feature cluster are basic data information clusters corresponding to the second data and the third data. The basic data information may be the second data and the third data themselves, or may be information that can match the second data and the third data, for example, an index table corresponding to the second data and the third data. The distributed storage sub-nodes in the secondary feature cluster and the tertiary feature cluster can correspond to the basic data information of each user, so that the basic data information can be extracted according to the distributed storage sub-nodes.
250. Acquire a query instruction of the user, where the query instruction carries the unique identifier of the user and the data to be queried.
Obtaining the query instruction of the user includes obtaining the query instruction directly at the cloud, and also includes obtaining the query instruction at the terminal and then transmitting it to the cloud. The query instruction carries the unique identifier of the user and the data to be queried.
260. Determine a target distributed storage database corresponding to the user according to the unique identifier.
When the query instruction of the user is obtained, the distributed storage database corresponding to the user is determined at the cloud according to the unique identifier of the user carried in the query instruction, where the distributed storage database includes the distributed storage sub-nodes holding the basic data information of the user.
270. Judge whether the data to be queried is the first data.
It should be noted that the data to be queried may be the first data, the second data or the third data, the user information data, and the like. Different query methods can be provided for different data to be queried. Since the first data relates to the underlying original data, the first data is not cloud-transmitted in order to protect the privacy of the user. Judging whether the query data is the first data or not is helpful for responding to the query instruction according to different conditions.
Those skilled in the art should appreciate that there may be various methods for extracting the basic data information in the distributed storage database according to the query instruction, and the above is merely an exemplary illustration and should not be construed as a limitation of the present application.
The data to be queried may be one kind of data or a plurality of kinds of data; for example, it may include both user information data and second data. The data to be queried may be data of a single level or data of multiple levels; for example, the second data and the third data may be queried simultaneously, or the first data and the second data may be queried simultaneously, and so on. In some embodiments, a query instruction of the user is obtained, where the query instruction carries the unique identifier of the user and a plurality of data to be queried; after the target distributed storage database corresponding to the user is determined according to the unique identifier, the types of the plurality of data to be queried are determined, and the query method corresponding to each type is executed. If the plurality of data to be queried belong to the same type, the query order of the data is determined, and whether each item is the first data is judged in that order: when an item is judged to be the first data, the query steps for the first data are executed; when an item is judged not to be the first data, the query steps for data other than the first data are executed.
281. If not, determine the distributed storage sub-nodes corresponding to the data to be queried in the target distributed storage database.
282. Match the basic data information with the data stored in the terminal to obtain target data.
When the data to be queried is second data or third data, determining distributed storage sub-nodes corresponding to the second data or the third data in the target distributed storage database; and extracting basic data information according to the distributed storage sub-nodes. For another example, when the data to be queried is user information data, etc., a distributed storage sub-node corresponding to the user information data to be queried is determined in the distributed storage database, and the user information data is extracted according to the distributed storage sub-node.
291. If so, perform feature extraction on the first data to be queried to obtain second data to be queried corresponding to the first data to be queried.
Because the first data and the panoramic information features corresponding to the first data are not stored in the cloud, the first data to be queried can be subjected to feature extraction to obtain second data to be queried corresponding to the first data to be queried, the second data can be queried, and the corresponding first data can be found according to the second data.
292. Determine, according to the second data to be queried, the distributed storage sub-nodes corresponding to the second data to be queried in the target distributed storage database, and extract the basic data information according to the distributed storage sub-nodes.
It should be noted that third data to be queried may also be obtained by fusing the panoramic features of the second data to be queried. In that case, in step 292, the distributed storage sub-nodes corresponding to the third data to be queried are determined in the target distributed storage database according to the third data to be queried, and the basic data information is extracted according to the distributed storage sub-nodes.
293. Match the basic data information with the second data of the terminal to obtain target second data.
294. Look up the target first data in the terminal according to the target second data.
When the data to be queried is the second data, the basic data information is matched with the second data of the terminal to obtain target second data; when the data to be queried is the third data, the basic data information is matched with the third data of the terminal to obtain target third data. When the data to be queried is the first data, the basic data information can be matched with the second data of the terminal to obtain target second data, and the target first data is then looked up in the terminal according to the target second data. In some embodiments, a matching mapping, a lookup table or the like between the basic data information and the first data of the terminal is preset at the cloud; in this case, the basic data information may also be matched directly with the first data of the terminal to obtain the target first data.
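Steps 293 and 294 can be sketched on the terminal side as follows. The class `TerminalStore`, the `feature_id` field and the mapping from second data back to first data are assumptions for illustration; the application does not define how the basic data information refers to the terminal's records.

```python
class TerminalStore:
    """Hypothetical terminal-side store for first and second data."""

    def __init__(self):
        self.first = {}            # record id -> first data (never uploaded)
        self.second = {}           # feature id -> second data
        self.second_to_first = {}  # feature id -> record id of its source

    def add(self, record_id, first_data, feature_id, second_data):
        self.first[record_id] = first_data
        self.second[feature_id] = second_data
        self.second_to_first[feature_id] = record_id

    def match_second(self, basic_info):
        # Step 293: match the cloud-side information against local second data.
        feature_id = basic_info.get("feature_id")
        return feature_id if feature_id in self.second else None

    def find_first(self, basic_info):
        # Step 294: follow the local mapping from second data to first data.
        feature_id = self.match_second(basic_info)
        if feature_id is None:
            return None
        return self.first[self.second_to_first[feature_id]]


terminal = TerminalStore()
terminal.add("rec-1", [1.0, 3.0], "feat-1", {"mean": 2.0})
target_first = terminal.find_first({"feature_id": "feat-1"})
missing = terminal.find_first({"feature_id": "feat-404"})
```

The cloud only ever sees the feature identifier; the step from second data to first data is resolved entirely inside the terminal.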
In some embodiments, the data privacy protection query method may specifically include the following. First, information of the user's electronic device is obtained through an information perception layer (specifically including electronic device operation information, user behavior information, information obtained by various sensors, electronic device state information, electronic device display content information, electronic device upload and download information, and the like). The information of the electronic device is then processed by a data processing layer (for example, classified). Next, a feature extraction layer extracts the second data and the third data from the information processed by the data processing layer (for the second data and the third data, refer to the description of the foregoing embodiments). The second data and the third data are then processed, and the target model parameters are uploaded through a storage module to a server for storage. To protect the data, the server does not receive the first data; when the first data needs to be queried, the related data of the second data and the third data is queried in the server and then mapped to the first data, which avoids operating on plaintext data.
As can be seen from the above, in the data privacy protection query method provided in the embodiment of the present application, basic data is first clustered to obtain first data, feature extraction is performed on the first data to obtain second data, a plurality of second data are fused to obtain third data, and the first data, the second data and the third data are stored in a terminal; basic data information of the terminal is then acquired, where the basic data information is basic data information of the second data and the third data; the basic data information is then stored in a distributed manner at the cloud to obtain a distributed storage database, where the distributed storage database includes a plurality of distributed storage sub-nodes; when a query instruction of a user is obtained, the basic data information is extracted from the distributed storage database according to the query instruction; and finally, the basic data information is matched at the terminal to obtain target basic data. With three-level storage, the key features of the basic data are extracted and fused, and plaintext data is not operated on directly when data is processed, which effectively protects the data security of the terminal system and the security of the user's private data. The first data is not transmitted to the cloud; only the second data and the third data are extracted and transmitted to the cloud, so that the user's private data is not exposed in the cloud, which further protects the security of the cloud system data and of the user's private data.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a data privacy protection query device according to an embodiment of the present application. The data privacy protection query device 300 may be integrated in an electronic device, and the data privacy protection query device 300 includes a processing module 301, an obtaining module 302, a storing module 303, an extracting module 304, and a matching module 305.
The processing module 301 is configured to cluster basic data to obtain first data, perform feature extraction on the first data to obtain second data, fuse a plurality of second data to obtain third data, and store the first data, the second data, and the third data at a terminal;
an obtaining module 302, configured to obtain basic data information of a terminal, where the basic data information is basic data information of second data and third data;
the storage module 303 is configured to perform distributed storage on the basic data information at a cloud to obtain a distributed storage database, where the distributed storage database includes a plurality of distributed storage sub-nodes;
the extracting module 304 is configured to, when a query instruction of a user is obtained, extract basic data information from the distributed storage database according to the query instruction;
and the matching module 305 is configured to match the basic data information at the terminal to obtain target basic data.
Referring to fig. 6, fig. 6 is a schematic view of another structure of the data privacy protection query device according to the embodiment of the present application. In some embodiments, the first data, the second data, and the third data in the processing module 301 are stored in the terminal, and the processing module 301 may include a first storage unit 3011, a second storage unit 3012, and a third storage unit 3013.
The first storage unit 3011 is configured to cluster the basic data to obtain first data, store the first data for the first time, and store the first data in a corresponding database;
the second storage unit 3012 is configured to perform feature extraction on the basic data of the databases to obtain a panoramic information feature corresponding to each database, use the panoramic information feature as second data, and perform second storage on the second data;
in some embodiments, the second storage unit is further configured to:
pre-training a machine learning model to obtain a machine learning model matched with the first data;
and inputting the first data into the machine learning model, acquiring a model output result, and taking the model output result as second data.
The third storage unit 3013 is configured to fuse the panoramic information features to obtain fused feature data, and to store the fused feature data as third data for a third time.
In some embodiments, the third storage unit 3013 may be specifically configured to fuse the multiple second data in a multi-table connection and/or a time alignment manner to obtain third data, and store the third data for a third time.
Referring to fig. 7, fig. 7 is a schematic view of another structure of the data privacy protection query device according to the embodiment of the present application. In some embodiments, the extraction module 304 may include a query unit 3041, a determination unit 3042, and an extraction unit 3043.
The query unit 3041 is configured to obtain a query instruction of a user, where the query instruction carries a unique identifier of the user and data to be queried;
the determining unit 3042 is configured to determine, according to the unique identifier, a target distributed storage database corresponding to the user;
the extracting unit 3043 is configured to determine a distributed storage sub-node corresponding to the data to be queried in the target distributed storage database, and extract basic data information according to the distributed storage sub-node.
When the data to be queried is the first data, the extracting unit 3043 performs steps including: performing feature extraction on first data to be queried to obtain second data to be queried corresponding to the first data to be queried; and determining a distributed storage sub-node corresponding to the second data to be queried in the target distributed storage database according to the second data to be queried.
In some embodiments, the extracting module 304 may further include an obtaining unit 3044 configured to obtain a data level of the data to be queried, where the data level includes at least the first data, the second data, and the third data.
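As a rough illustration of how the query unit, determining unit, obtaining unit, and extracting unit might cooperate, the following Python sketch routes a query by unique identifier and data level. The `QueryInstruction` class, the `CLOUD` layout, and the `level2`/`level3` sub-node names are assumptions made for the example; the text does not specify how sub-nodes are keyed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryInstruction:
    user_id: str  # unique identifier carried by the query instruction
    level: int    # data level of the data to be queried: 1, 2 or 3
    key: str      # what is being queried


# Hypothetical cloud layout: user id -> sub-node name -> {key: basic data information}
CLOUD = {
    "terminal-A": {
        "level2": {"steps": "feat-2a"},
        "level3": {"daily": "feat-3a"},
    },
}


def extract_basic_info(query, cloud):
    """Pick the user's database, then the sub-node for the requested level."""
    database = cloud[query.user_id]  # target distributed storage database
    if query.level not in (2, 3):
        # First data is not held in the cloud; such a query is first
        # converted to second data by feature extraction.
        raise ValueError("level-1 queries must first be converted to level-2 features")
    sub_node = database[f"level{query.level}"]
    return sub_node.get(query.key)


info = extract_basic_info(QueryInstruction("terminal-A", 3, "daily"), CLOUD)
```

A level-1 query is rejected here only to keep the sketch short; the preceding paragraph describes converting it to second data before the sub-node is determined.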
Referring to fig. 8, fig. 8 is a schematic view of another structure of the data privacy protection query device according to the embodiment of the present application. In some embodiments, the matching module 305 may include a first matching unit 3051, a second matching unit 3052, and a third matching unit 3053.
The first matching unit 3051 is configured to, when the data to be queried is first data, match the target basic data with the second data of the terminal to obtain target second data, and to search the terminal for the target first data according to the target second data.
The second matching unit 3052 is configured to, when the data to be queried is second data, match the basic data information with the second data of the terminal to obtain target second data.
And the third matching unit 3053 is configured to, when the data to be queried is third data, match the basic data information with the second data of the terminal to obtain target third data.
In some embodiments, the apparatus may further include a backup module and a transmission module. The backup module is configured to back up the fused feature data at the terminal in real time. The transmission module is configured to transmit the fused feature data to the application service layer or the data processing layer, so that the application service layer or the data processing layer can perform calculations using the fused feature data.
As can be seen from the above, an embodiment of the present application provides a data privacy protection query apparatus. The processing module 301 first clusters basic data to obtain first data, performs feature extraction on the first data to obtain second data, fuses a plurality of second data to obtain third data, and stores the first data, the second data, and the third data in a terminal. The obtaining module 302 then obtains basic data information of the terminal, where the basic data information is basic data information of the second data and the third data. Next, the storage module 303 performs distributed storage of the basic data information at the cloud to obtain a distributed storage database that includes a plurality of distributed storage sub-nodes. When a query instruction of a user is obtained, the extraction module 304 extracts basic data information from the distributed storage database according to the query instruction. Finally, the matching module 305 matches the basic data information at the terminal to obtain the target basic data. With three-level storage, key features of the basic data are extracted and fused, and data operations avoid working directly on plaintext data, which effectively protects the data security of the terminal system and the security of user privacy data. The first data is not transmitted to the cloud, while the second data and the third data are extracted and transmitted to the cloud, so user privacy data is not exposed in the cloud, which further protects the security of the cloud system data and of the user privacy data.
An embodiment of the present application further provides an electronic device. The electronic device may be a smartphone, a tablet, a gaming device, an AR (Augmented Reality) device, an automobile, a data privacy protection query device, an audio playback device, a video playback device, a notebook, a desktop computing device, or a wearable device such as a watch, glasses, a helmet, an electronic bracelet, an electronic necklace, or an electronic garment.
Referring to fig. 9, fig. 9 is a schematic view of a first structure of an electronic device 900 according to an embodiment of the present disclosure. Electronic device 900 includes, among other things, a processor 901 and memory 902. The processor 901 is electrically connected to the memory 902.
The processor 901 is a control center of the electronic device 900, connects various parts of the whole electronic device by using various interfaces and lines, and performs various functions of the electronic device and processes data by running or calling a computer program stored in the memory 902 and calling data stored in the memory 902, thereby performing overall monitoring of the electronic device.
In this embodiment, the processor 901 in the electronic device 900 loads instructions corresponding to one or more processes of the computer program into the memory 902 according to the following steps, and the processor 901 runs the computer program stored in the memory 902, so as to implement various functions:
clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of second data to obtain third data, and storing the first data, the second data and the third data at a terminal;
acquiring basic data information of the terminal, wherein the basic data information is basic data information of second data and third data;
performing distributed storage on basic data information at a cloud end to obtain a distributed storage database, wherein the distributed storage database comprises a plurality of distributed storage sub-nodes;
when a query instruction of a user is acquired, extracting basic data information from a distributed storage database according to the query instruction;
and matching the basic data information at the terminal to obtain target basic data.
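The distributed storage step can be sketched as follows, under stated assumptions: the text does not say how basic data information is assigned to sub-nodes, so a stable hash is used here as a hypothetical placement rule, and `pick_node`, `upload`, and the store layout are illustrative names only. The point the sketch tries to show is that only second-level and third-level entries are copied to the cloud.

```python
import hashlib


def pick_node(key, node_count):
    """Map a key to a sub-node index with a stable hash (assumed placement rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % node_count


def upload(terminal_store, node_count=3):
    """Copy only level-2 and level-3 entries into the cloud sub-nodes."""
    nodes = [dict() for _ in range(node_count)]
    for level in ("second", "third"):
        for key, info in terminal_store[level].items():
            nodes[pick_node(f"{level}:{key}", node_count)][(level, key)] = info
    return nodes


terminal_store = {
    "first": {"raw-1": "plaintext user record"},  # never leaves the terminal
    "second": {"steps": [0.2, 0.8]},
    "third": {"daily": [0.2, 0.8, 0.5]},
}
cloud_nodes = upload(terminal_store)
```

Because `upload` never reads the `first` entry, the plaintext record cannot appear in any sub-node in this sketch.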
In some embodiments, when performing feature extraction on the first data to obtain second data, the processor 901 performs the following steps:
pre-training a machine learning model to obtain a machine learning model matched with the first data;
and inputting the first data into the machine learning model, acquiring a model output result, and taking the model output result as second data.
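The pre-train and infer pattern above might look like the following sketch. The text does not name a model, so a very small learned transformation (per-column mean and standard deviation) stands in for the machine learning model; `StandardizingEncoder` and its methods are hypothetical names, and a real embodiment would presumably use a richer model.

```python
from statistics import mean, pstdev


class StandardizingEncoder:
    """Stand-in 'model': learns column statistics, then encodes data against them."""

    def fit(self, rows):
        # Pre-training: learn the mean and spread of each column.
        cols = list(zip(*rows))
        self.means = [mean(c) for c in cols]
        self.stds = [pstdev(c) or 1.0 for c in cols]  # avoid division by zero
        return self

    def encode(self, rows):
        # Inference: the model output (one z-score per column) is the second data.
        cols = list(zip(*rows))
        return [(mean(c) - m) / s for c, m, s in zip(cols, self.means, self.stds)]


training = [[0.0, 10.0], [2.0, 30.0], [4.0, 20.0]]
model = StandardizingEncoder().fit(training)
second_data = model.encode([[2.0, 20.0], [4.0, 20.0]])
```

The output vector describes the input relative to what the model learned, without containing the input rows themselves.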
In some embodiments, when the plurality of second data are fused to obtain third data, the processor 901 performs the following steps:
and fusing the plurality of second data in a multi-table connection and/or time sequence alignment mode to obtain third data.
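One way to read "multi-table connection and/or time sequence alignment" is a join of several feature tables on a shared time bucket. The sketch below assumes that reading; the 60-second bucket, the inner-join behavior, and the name `fuse_by_time` are illustrative choices, not details given in the text.

```python
def fuse_by_time(tables, bucket_seconds=60):
    """Join several feature tables on a shared time bucket."""
    fused = {}
    for name, rows in tables.items():
        for timestamp, value in rows:
            # Time alignment: snap each timestamp to the start of its bucket.
            bucket = timestamp - timestamp % bucket_seconds
            # A later value in the same bucket overwrites an earlier one.
            fused.setdefault(bucket, {})[name] = value
    # Inner join: keep only the buckets that every table contributed to.
    return {b: row for b, row in sorted(fused.items()) if len(row) == len(tables)}


tables = {
    "sensor": [(0, 0.1), (65, 0.4)],
    "behavior": [(10, 7), (70, 9), (130, 2)],
}
third_data = fuse_by_time(tables)
```

The bucket starting at 130 seconds is dropped because only one table has a row in it; an outer join would be an equally plausible design if partial rows are acceptable.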
In some embodiments, when acquiring a query instruction of a user, and extracting basic data information in a distributed storage database according to the query instruction, the processor 901 performs the following steps:
acquiring a query instruction of a user, wherein the query instruction carries a unique identifier of the user and data to be queried;
determining a target distributed storage database corresponding to the user according to the unique identifier;
and determining distributed storage sub-nodes corresponding to the data to be queried in the target distributed storage database, and extracting basic data information according to the distributed storage sub-nodes.
In some embodiments, before determining the distributed storage child node corresponding to the data to be queried in the target distributed storage database, the processor 901 performs the following steps:
and acquiring the data level of the data to be queried, wherein the data level comprises first data, second data and third data.
In some embodiments, when the data to be queried is the second data or the third data, the processor 901 performs the following steps:
determining distributed storage sub-nodes corresponding to the data to be queried in a target distributed storage database;
when the data to be queried is second data, matching the basic data information with the second data of the terminal to obtain target second data;
and when the data to be queried is third data, matching the basic data information with the second data of the terminal to obtain target third data.
In some embodiments, when the data to be queried is the first data, the processor 901 performs the following steps:
performing feature extraction on first data to be queried to obtain second data to be queried corresponding to the first data to be queried;
and determining a distributed storage sub-node corresponding to the second data to be queried in the target distributed storage database according to the second data to be queried.
In some embodiments, when the data to be queried is the first data, and the basic data information is matched at the terminal to obtain the target basic data, the processor 901 performs the following steps:
matching the target basic data with second data of the terminal to obtain target second data;
and searching the target first data in the terminal according to the target second data.
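The first-data query path described in the last two groups of steps can be sketched end to end. The fingerprint function, the two-node cloud, and the exact-equality match are all assumptions chosen to keep the example small; the text does not specify the feature extractor or the matching criterion.

```python
def extract_feature(first_record):
    """Hypothetical feature extractor: a coarse fingerprint of the record."""
    return (len(first_record), sum(map(ord, first_record)) % 97)


def node_index(feature, node_count=2):
    """Assumed rule for choosing the sub-node that holds a feature."""
    return feature[0] % node_count


# Terminal-side stores: first data (plaintext) and its second data (features).
FIRST = {"rec-1": "walked 5 km", "rec-2": "slept 8 h"}
SECOND = {rid: extract_feature(text) for rid, text in FIRST.items()}

# The cloud holds feature information only, spread over sub-nodes.
CLOUD_NODES = [{}, {}]
for feat in SECOND.values():
    CLOUD_NODES[node_index(feat)][feat] = feat


def query_first_data(query_record):
    feat = extract_feature(query_record)     # second data to be queried
    node = CLOUD_NODES[node_index(feat)]     # matching distributed sub-node
    info = node.get(feat)                    # basic data information from the cloud
    if info is None:
        return None
    for rid, second in SECOND.items():       # match against terminal second data
        if second == info:
            return FIRST[rid]                # look up the target first data locally
    return None


result = query_first_data("walked 5 km")
```

The plaintext record is read only in the last step, on the terminal; the cloud lookup sees nothing but the fingerprint.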
In some embodiments, referring to fig. 10, fig. 10 is a schematic diagram of a second structure of an electronic device 900 according to an embodiment of the present disclosure.
The electronic device 900 further includes a display 903, a control circuit 904, an input unit 905, a sensor 906, and a power supply 907. The processor 901 is electrically connected to the display 903, the control circuit 904, the input unit 905, the sensor 906, and the power supply 907.
The display 903 may be used to display information entered by or provided to the user as well as various graphical user interfaces of the electronic device, which may be comprised of images, text, icons, video, and any combination thereof.
The control circuit 904 is electrically connected to the display 903, and is configured to control the display 903 to display information.
The input unit 905 may be used to receive input numbers, character information, or user characteristic information (e.g., fingerprint), and generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. The input unit 905 may include a fingerprint recognition module.
The sensor 906 is used to collect information of the electronic device itself or information of the user or external environment information. For example, the sensors 906 may include a plurality of sensors such as a distance sensor, a magnetic field sensor, a light sensor, an acceleration sensor, a fingerprint sensor, a hall sensor, a position sensor, a gyroscope, an inertial sensor, an attitude sensor, a barometer, a heart rate sensor, and the like.
Power supply 907 is used to power the various components of electronic device 900. In some embodiments, power supply 907 may be logically coupled to processor 901 via a power management system, such that functions of managing charging, discharging, and power consumption are performed via the power management system.
Although not shown in fig. 10, the electronic device 900 may further include a camera, a bluetooth module, etc., which are not described in detail herein.
As can be seen from the above, an embodiment of the present application provides an electronic device, where a processor in the electronic device performs the following steps. First, basic data is clustered to obtain first data, feature extraction is performed on the first data to obtain second data, a plurality of second data are fused to obtain third data, and the first data, the second data, and the third data are stored at a terminal. Basic data information of the terminal is then acquired, where the basic data information is basic data information of the second data and the third data. Next, the basic data information is stored in a distributed manner at the cloud to obtain a distributed storage database that includes a plurality of distributed storage sub-nodes. When a query instruction of a user is obtained, basic data information is extracted from the distributed storage database according to the query instruction. Finally, the basic data information is matched at the terminal to obtain target basic data. With three-level storage, key features of the basic data are extracted and fused, and data operations avoid working directly on plaintext data, which effectively protects the data security of the terminal system and the security of user privacy data. The first data is not transmitted to the cloud, while the second data and the third data are extracted and transmitted to the cloud, so user privacy data is not exposed in the cloud, which further protects the security of the cloud system data and of the user privacy data.
The embodiment of the present application further provides a storage medium, in which a computer program is stored, and when the computer program runs on a computer, the computer executes the data privacy protection query method according to any one of the above embodiments.
For example, in some embodiments, when the computer program is run on a computer, the computer performs the steps of:
clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of second data to obtain third data, and storing the first data, the second data and the third data at a terminal;
acquiring basic data information of the terminal, wherein the basic data information is basic data information of second data and third data;
performing distributed storage on basic data information at a cloud end to obtain a distributed storage database, wherein the distributed storage database comprises a plurality of distributed storage sub-nodes;
when a query instruction of a user is acquired, extracting basic data information from a distributed storage database according to the query instruction;
and matching the basic data information at the terminal to obtain target basic data.
It should be noted that those skilled in the art will understand that all or part of the steps of the methods in the above embodiments may be implemented by a computer program instructing the relevant hardware, and that the computer program may be stored in a computer-readable storage medium, which may include but is not limited to: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.
The data privacy protection query method, the data privacy protection query device, the storage medium and the electronic device provided by the embodiment of the application are described in detail above. The principle and the implementation of the present application are explained herein by applying specific examples, and the above description of the embodiments is only used to help understand the method and the core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (13)

1. A data privacy protection query method comprises the following steps:
clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of second data to obtain third data, and storing the first data, the second data and the third data at a terminal;
acquiring basic data information of a terminal, wherein the basic data information is basic data information of second data and third data;
performing distributed storage on the basic data information at a cloud end to obtain a distributed storage database, wherein the distributed storage database comprises a plurality of distributed storage sub-nodes;
when a query instruction of a user is acquired, extracting the basic data information from a distributed storage database according to the query instruction;
and matching the basic data information at the terminal to obtain target basic data.
2. The data privacy protection query method according to claim 1, wherein before the obtaining of the basic data information of the terminal, the method further comprises:
and setting unique identifiers for different terminals to distinguish different terminals.
3. The data privacy protection query method according to claim 2, wherein when the query instruction of the user is obtained, the extracting the basic data information in the distributed storage database according to the query instruction includes:
acquiring a query instruction of a user, wherein the query instruction carries a unique identifier of the user and data to be queried;
determining a target distributed storage database corresponding to the user according to the unique identifier;
and determining distributed storage sub-nodes corresponding to the data to be queried in the target distributed storage database, and extracting the basic data information according to the distributed storage sub-nodes.
4. The data privacy protection query method according to claim 3, wherein before determining the distributed storage child node corresponding to the data to be queried in the target distributed storage database, the method further includes:
and acquiring the data level of the data to be queried, wherein the data level comprises first data, second data and third data.
5. The data privacy protection query method according to claim 4, wherein the determining, in the target distributed storage database, the distributed storage child node corresponding to the data to be queried includes:
when the data to be queried is second data or third data, determining a distributed storage sub-node corresponding to the data to be queried in the target distributed storage database;
the matching the basic data information at the terminal to obtain the target basic data comprises:
when the data to be inquired is second data, matching the basic data information with the second data of the terminal to obtain target second data;
and when the data to be inquired is third data, matching the basic data information with second data of the terminal to obtain target third data.
6. The data privacy protection query method of claim 4, wherein the determining, in the target distributed storage database, the distributed storage child node corresponding to the query instruction comprises:
when the data to be queried is first data, performing feature extraction on the first data to be queried to obtain second data to be queried corresponding to the first data to be queried;
and determining a distributed storage sub-node corresponding to the second data to be queried in the target distributed storage database according to the second data to be queried.
7. The data privacy protection query method according to claim 6, wherein the matching the basic data information at a terminal to obtain target basic data comprises:
matching the target basic data with second data of the terminal to obtain target second data;
and searching out the target first data in the terminal according to the target second data.
8. The data privacy protection query method of claim 1, wherein the basic data at least comprises behavior data of a user operation terminal, sensor data and terminal system operation data.
9. The data privacy protection query method of claim 1, wherein the performing feature extraction on the first data to obtain second data comprises:
pre-training a machine learning model to obtain a machine learning model matched with the first data;
and inputting the first data into the machine learning model, obtaining a model output result, and taking the model output result as the second data.
10. The data privacy protection query method according to claim 1, wherein the fusing the plurality of second data to obtain third data includes:
and fusing the plurality of second data in a multi-table connection and/or time sequence alignment mode to obtain third data.
11. A data privacy protection query device, wherein the data privacy protection query device comprises:
the processing module is used for clustering basic data to obtain first data, performing feature extraction on the first data to obtain second data, fusing a plurality of second data to obtain third data, and storing the first data, the second data and the third data at a terminal;
the acquisition module is used for acquiring basic data information of the terminal, wherein the basic data information is basic data information of second data and third data;
the storage module is used for performing distributed storage on the basic data information at a cloud end to obtain a distributed storage database, and the distributed storage database comprises a plurality of distributed storage sub-nodes;
the extraction module is used for extracting the basic data information from the distributed storage database according to the query instruction when the query instruction of the user is acquired;
and the matching module is used for matching the basic data information at the terminal to obtain target basic data.
12. A storage medium having stored thereon a computer program, characterized in that, when the computer program runs on a computer, it causes the computer to execute the data privacy protection query method according to any one of claims 1 to 10.
13. An electronic device, characterized in that the electronic device comprises a processor and a memory, wherein the memory stores a computer program, and the processor is used for executing the data privacy protection query method according to any one of claims 1 to 10 by calling the computer program stored in the memory.
CN201910282034.7A 2019-04-09 2019-04-09 Data privacy protection query method and device, storage medium and electronic equipment Pending CN111797422A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910282034.7A CN111797422A (en) 2019-04-09 2019-04-09 Data privacy protection query method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN111797422A true CN111797422A (en) 2020-10-20

Family

ID=72805739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910282034.7A Pending CN111797422A (en) 2019-04-09 2019-04-09 Data privacy protection query method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN111797422A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102882923A (en) * 2012-07-25 2013-01-16 北京亿赛通科技发展有限责任公司 Secure storage system and method for mobile terminal
CN102906751A (en) * 2012-07-25 2013-01-30 华为技术有限公司 Method and device for data storage and data query
CN103607393A (en) * 2013-11-21 2014-02-26 浪潮电子信息产业股份有限公司 Data safety protection method based on data partitioning
US20140108415A1 (en) * 2012-10-17 2014-04-17 Brian J. Bulkowski Method and system of mapreduce implementations on indexed datasets in a distributed database environment
CN109583224A (en) * 2018-10-16 2019-04-05 阿里巴巴集团控股有限公司 A kind of privacy of user data processing method, device, equipment and system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112417017A (en) * 2020-11-19 2021-02-26 郑州轻工业大学 Cyclic filtering processing fusion system for heterogeneous data
CN116305297A (en) * 2023-05-22 2023-06-23 天云融创数据科技(北京)有限公司 Data analysis method and system for distributed database
CN116305297B (en) * 2023-05-22 2023-09-15 天云融创数据科技(北京)有限公司 Data analysis method and system for distributed database

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination