WO2020164272A1 - Method, apparatus, storage medium, and computer device for identifying an Internet access device - Google Patents

Info

Publication number
WO2020164272A1
Authority
WO
WIPO (PCT)
Prior art keywords
browser
value
feature
weight
internet
Prior art date
Application number
PCT/CN2019/117866
Other languages
English (en)
French (fr)
Inventor
黎立桂
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020164272A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/142 Network analysis or design using statistical or mathematical methods

Definitions

  • This application relates to the technical field of Internet-based equipment research. Specifically, this application relates to a method, device, storage medium, and computer equipment for identifying Internet-based equipment.
  • Drawback of cookies: as the Internet places increasing importance on personal privacy, cookies are becoming less and less welcome. Many security tools and even browsers now allow or guide users to turn off cookies, so cookie-based identification has little effect.
  • This application proposes a method, device, storage medium, and computer equipment for identifying online equipment, so as to accurately identify the online equipment in the user terminal, so as to achieve accurate recording and tracking of the online equipment.
  • A method for identifying an Internet access device includes: acquiring browser features of the Internet access device; acquiring hash values of the browser features and their corresponding weights; performing a simhash calculation according to the hash values and the corresponding weights to obtain a browser feature value string corresponding to the browser features; and comparing the browser feature value string with system historical data to identify the Internet access device, wherein the system historical data includes value strings corresponding to the browser features of multiple devices.
  • An apparatus for identifying an Internet access device includes: a first acquisition module for acquiring browser features of the Internet access device; a second acquisition module for acquiring hash values of the browser features and their corresponding weights; a calculation module for performing a simhash calculation according to the hash values and the corresponding weights to obtain a browser feature value string corresponding to the browser features; and an identification module for comparing the browser feature value string with system historical data to identify the Internet access device, wherein the system historical data includes value strings corresponding to the browser features of multiple devices.
  • A computer device includes: one or more processors; a memory; and one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to execute the method for identifying an Internet access device according to any one of the foregoing embodiments.
  • The method for identifying an Internet access device obtains the weights corresponding to the browser features of the device and the hash values of those features, and performs a simhash calculation based on the two, so that the browser features are de-duplicated by the simhash algorithm and the corresponding browser feature value string is obtained. Finally, the Internet access device is identified according to the result of comparing the browser feature value string with the system historical data.
  • The above method uses the browser features of the Internet access device and a simhash calculation to identify the device accurately, enabling accurate recording and tracking of the device.
  • FIG. 1 is a schematic diagram of the interaction between a server and an Internet access device in an embodiment provided by this application;
  • FIG. 2 is a method flowchart in an embodiment of a method for identifying Internet devices provided by this application;
  • FIG. 3 is a method flowchart in another embodiment of a method for identifying an Internet device provided by this application;
  • FIG. 4 is a schematic flowchart of a specific implementation of a method for identifying online devices provided by this application;
  • FIG. 5 is a flowchart of a method in an embodiment of step S400 provided by this application.
  • FIG. 6 is a structural block diagram in an embodiment of an apparatus for identifying Internet equipment provided by this application.
  • FIG. 7 is a schematic structural diagram in an embodiment of a computer device provided by this application.
  • the method for identifying Internet equipment provided by this application is applicable to various server systems that provide a browser to the front end and identify the attributes of the Internet equipment on the user side through the browser.
  • the method for identifying Internet access devices is applied in the application environment shown in FIG. 1.
  • the server 100 and the Internet access device 300 on the user side are located in the same network 200 environment, and the server 100 and the Internet access device 300 on the user side exchange data and information through the network 200.
  • the Internet access device 300 on the user side performs network communication with the server 100.
  • the number of the server 100 and the user terminal 300 is not limited, and what is shown in FIG. 1 is only for illustration.
  • a browser client is installed in the Internet access device 300 on the user side for network access to the server 100.
  • the user can interact with the corresponding server 100 through the browser client in the Internet access device 300.
  • the browser client and the server correspond to each other and follow the same data protocol, so that the server and the browser client can parse each other's data.
  • the server 100 may be, but is not limited to, a web server, a management server, an application server, a database server, a cloud server, and so on.
  • the Internet access device 300 at the user end may be, but is not limited to, a smart phone, a personal computer (PC), a tablet computer, a personal digital assistant (PDA), a mobile Internet device (MID), etc.
  • the operating system of the Internet access device 300 at the user end may be, but is not limited to, an Android system, an IOS (IPhone operating system) system, a Windows phone system, a Windows system, etc.
  • the method for identifying an Internet-connected device includes the following steps:
  • the server obtains multiple browser features of the Internet device.
  • Browser features can include the available screen resolution of the device, the actual screen resolution of the device, the number of touch points the device supports, the audio stack fingerprint, the total number of logical processors the system makes available to the user agent, browser plug-ins, fonts installed in the browser, Canvas, WebGL, etc.
  • the server accesses the browser API through JavaScript to obtain multiple browser features.
  • the multiple browser features include features that reflect the browser itself as well as features that reflect the software and hardware of the operating system. In a specific implementation, at least 50 browser features of the Internet access device can be obtained. Combining multiple browser features raises the entropy of the feature set and avoids the low entropy of any single feature, so that an accurate, unique device fingerprint can be determined.
  • in one implementation, the browser features of the Internet access device include one or more of: the available screen resolution of the device, the actual screen resolution of the device, the number of touch points the device supports, the audio stack fingerprint, the total number of logical processors the system makes available to the user agent, browser plug-ins, fonts installed in the browser, Canvas, and WebGL.
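  • the actual screen resolution of the device refers to the width and height of the device screen in pixels, such as 1920*1080. The available screen resolution refers to the screen width and height in pixels that an application can use; for example, if the Windows taskbar is 40px high, the available screen resolution is 1920*(1080-40)=1920*1040.
  • the audio stack fingerprint is a string composed of 7 parts, such as 48000_2_1_0_2_explicit_speakers, where:
  • the first part is the sampling frequency.
  • the second part is the maximum number of channels.
  • the third part is the number of input bytes.
  • the fourth part is the number of output bytes.
  • the fifth part is the number of channels.
  • the sixth part describes how ETS computes internal values when connections to the audio input are up-mixed or down-mixed.
  • the seventh part is an enumeration value describing the meaning of the channels; it defines how audio up-mixing and down-mixing occur.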
  • the server hashes the acquired browser features to obtain the corresponding hash value.
  • at the same time, the weight of each browser feature is determined. The weight is derived from data statistics, such as the stability of the browser feature (how many missing values it has) and the entropy of the browser feature.
  • the browser feature of the Internet access device includes the device feature of the Internet access device acquired through the browser and the client side feature of the browser.
  • step S100 includes: sending an SDK package to the Internet access device through the browser and using the SDK package to obtain the device features of the Internet access device; and obtaining the client features by reading the client data of the browser.
  • step S200 includes: obtaining the hash value and the weight of each device feature; obtaining the hash value and the weight of each client feature; and using the hash values and weights of the device features together with the hash values and weights of the client features as the hash values and corresponding weights of the browser features.
  • the device characteristics of the Internet access device acquired through the browser include the available screen resolution of the device, the actual screen resolution of the device, and the number of points that the device can touch.
  • the client-side features of the browser include audio stack fingerprints, the total number of logical processors available to the user agent by the system, browser plug-ins, browser-installed fonts, Canvas, WebGL, etc.
  • the system can send SDK packages to Internet devices through the browser, and read the device characteristics of the Internet devices through the SDK packages. Among them, the SDK package is used to analyze the device characteristics of the Internet-connected device and send the device characteristics to the system. In addition, the system obtains the client characteristic data of the browser by reading relevant data of the front-end browser. Further, the hash value and weight of the device feature of the Internet device, and the hash value and weight of the client feature are used as the hash value and corresponding weight of the browser feature to perform the calculation in step S300.
  • the obtaining the weight corresponding to the browser feature includes: obtaining the weight of each browser feature according to the missing value of each browser feature. Or, it includes: obtaining the weight of each browser feature according to the entropy value of each browser feature.
  • S300 Perform simhash calculation according to the hash value and the corresponding weight to obtain a browser feature value string corresponding to the browser feature.
  • the server performs simhash calculation according to the hash value of the browser feature and the corresponding weight to obtain the browser feature value string corresponding to the browser feature. It should be noted that the server performs numerical processing on the field of the browser feature of the historical device, and saves the relationship in the dictionary corresponding to the system. In this step, the same numerical processing method is also used to obtain the browser characteristic value string.
  • the simhash algorithm is a type of LSH (Locality-Sensitive Hashing) algorithm. Simhash is calculated to map a high-dimensional feature vector to a low-dimensional feature vector by way of dimensionality reduction, and then determine the similarity of the two vectors according to the Hamming distance of the two vectors after dimensionality reduction.
  • in one embodiment there are multiple browser features. Acquiring the hash values and corresponding weights of the browser features, and performing the simhash calculation according to them to obtain the browser feature value string, includes:
  • S201 Perform hash processing on each browser feature to obtain a hash value of each browser feature.
  • S203 Obtain the weight corresponding to each browser feature.
  • S301 Weight the corresponding hash value according to the weight of each browser feature to obtain a weighted value.
  • S303 Merge and accumulate the weighted values of the multiple browser features to obtain an accumulated value.
  • S305 Perform dimensionality reduction on the accumulated value to obtain the browser feature value string.
  • each browser feature has a different degree of influence on identifying the Internet access device, and the server sets a different weight for each browser feature according to that influence.
  • the hash value of each browser feature is weighted according to the feature's weight, the weighted values are accumulated, and finally the accumulated value is reduced in dimension to obtain the browser feature value string.
  • in a specific implementation, take 4 features as an example: screen resolution, number of touch points, audio stack fingerprint, and WebGL, with weights (1, 2, 3, 4). The calculation logic is shown in FIG. 4.
  • S400 Comparing the browser characteristic value string with system historical data to identify the online device; wherein the system historical data includes the value strings corresponding to the browser characteristics of multiple devices.
  • the server compares the browser characteristic value string with system historical data.
  • after the server obtains the browser feature value string, it looks up the browser feature value strings of devices of the corresponding type stored in the server. If the dictionary in which the server stores value strings contains no browser feature value string for this type of device, the obtained browser feature value string is stored directly in the dictionary and the device of this type is marked. Otherwise, the string is compared pairwise with the set of historical value strings in the server dictionary to identify the Internet access device.
  • step S400 includes the following steps:
  • S410 Determine whether the system historical data includes data corresponding to the browser features.
  • S420 If so, compare the browser feature value string pairwise with the set of value strings in the system historical data and compute the Hamming distance between them; if the Hamming distance is within the threshold range, determine that the Internet access device and the device corresponding to that value string in the system are the same device, and assign both the same device fingerprint.
  • S430 If not, assign a device fingerprint to the Internet access device and associate that device fingerprint with the browser feature value string.
  • when the server obtains the browser feature value string, it first determines whether a value string corresponding to the browser features exists in the system. If so, it compares the browser feature value string pairwise with the set of historical value strings and computes the Hamming distance between the two strings (that is, the number of 1 bits after an exclusive-OR operation). If the distance is within the threshold range, the current Internet access device and the device corresponding to the stored value string are considered the same device, and the system uses the same device fingerprint ID. Otherwise, it stores the browser feature value string and associates a device fingerprint with it. With this logic, a new Internet access device receives a new device fingerprint and the same device receives the same device fingerprint, so that the target Internet access device can be recorded and tracked.
  • the threshold range is determined as follows: obtain from the system historical data multiple Internet access devices that are marked with value strings related to browser features; obtain, for each of these devices, the hash values of its browser features and the weight of each feature; perform a simhash calculation on each device's hash values and weights to obtain its browser feature value string; and compute the Hamming distance between each device's browser feature value string and the corresponding value string marked in the system, determining the threshold range from the results.
  • to determine the threshold range in a specific system, multiple Internet access devices marked with value strings related to browser features can be obtained from the system. A simhash calculation is performed on the hash values and weights of each device's browser features to obtain each browser feature value string. The Hamming distance between each browser feature value string and the corresponding value string stored in the system is then computed, yielding multiple result values. The range of these result values is used as the threshold range.
  • determining the threshold range according to the calculation results includes: computing the Hamming distance between the browser feature value string of each Internet access device and the corresponding value string marked in the system to obtain multiple Hamming distance values; selecting the maximum and the minimum of these values; and using the range of natural numbers between the minimum and the maximum as the threshold range.
  • the multiple Hamming distance values obtained from the devices marked with value strings related to browser features serve as training samples, from which the maximum and minimum Hamming distance values are taken.
  • the threshold range is the range of natural numbers between the minimum value and the maximum value.
  • the method for identifying an Internet access device obtains the weights corresponding to the browser features of the device and the hash values of those features, and performs a simhash calculation based on the two, so that the browser features are de-duplicated by the simhash algorithm and the corresponding browser feature value string is obtained. Finally, the Internet access device is identified according to the result of comparing the browser feature value string with the system historical data.
  • the above method uses the browser features of the Internet access device and a simhash calculation to identify the device accurately, enabling accurate recording and tracking of the device.
  • different devices obtain different browser fingerprints through the algorithm, and the same Internet access device obtains the same browser fingerprint, so the user can be tracked and identified without noticing.
  • the identification device of the Internet access device includes a first acquisition module 10, a second acquisition module 20, a calculation module 30 and an identification module 40.
  • the first obtaining module 10 is used to obtain the browser characteristics of the Internet-connected device.
  • the server obtains multiple browser features of the Internet device.
  • Browser features can include the available screen resolution of the device, the actual screen resolution of the device, the number of touch points the device supports, the audio stack fingerprint, the total number of logical processors the system makes available to the user agent, browser plug-ins, fonts installed in the browser, Canvas, WebGL, etc.
  • the server accesses the browser API through JavaScript to obtain multiple browser features.
  • the multiple browser features include features that reflect the browser itself as well as features that reflect the software and hardware of the operating system. In a specific implementation, at least 50 browser features of the Internet access device can be obtained. Combining multiple browser features raises the entropy of the feature set and avoids the low entropy of any single feature, so that an accurate, unique device fingerprint can be determined.
  • the second obtaining module 20 is configured to obtain the hash value and the corresponding weight of the browser feature.
  • the server hashes the acquired browser features to obtain the corresponding hash value.
  • the weight of the browser feature is determined according to data statistics, such as the stability of the browser feature (how many missing values) and the entropy of the browser feature, etc.
  • the calculation module 30 is configured to perform simhash calculation according to the hash value and the corresponding weight to obtain the browser feature value string corresponding to the browser feature.
  • the server performs simhash calculation according to the hash value of the browser feature and the corresponding weight to obtain the browser feature value string corresponding to the browser feature. It should be noted that the server performs numerical processing on the field of the browser feature of the historical device, and saves the relationship in the dictionary corresponding to the system. In this module, the same numerical processing method is also used to obtain the browser characteristic numerical string.
  • the identification module 40 is configured to compare the browser characteristic value string with system historical data to identify the online device; wherein, the system historical data includes a number string corresponding to the browser characteristics of multiple devices.
  • the server compares the browser characteristic value string with system historical data.
  • after the server obtains the browser feature value string, it looks up the browser feature value strings of devices of the corresponding type stored in the server. If the dictionary in which the server stores value strings contains no browser feature value string for this type of device, the obtained browser feature value string is stored directly in the dictionary and the device of this type is marked. Otherwise, the string is compared pairwise with the set of historical value strings in the server dictionary to identify the Internet access device.
  • each module in the apparatus for identifying an Internet access device provided in this application also performs the operations corresponding to each step of the method described in this application, which are not described again here.
  • the application also provides a storage medium.
  • the storage medium stores a computer program; when the computer program is executed by the processor, it implements the method for identifying an Internet device according to any one of the above embodiments.
  • the storage medium may be a memory.
  • the internal memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory.
  • External storage can include hard disks, floppy disks, ZIP disks, U disks, tapes, etc.
  • the storage medium disclosed in this application includes but is not limited to these types of memories.
  • the memory disclosed in this application is only an example and not a limitation.
  • a computer device includes: one or more processors; a memory; and one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to execute the method for identifying an Internet access device described in any of the above embodiments.
  • FIG. 7 is a schematic structural diagram of a computer device in an embodiment of the application.
  • the computer device described in this embodiment may be a server, a personal computer, or a network device.
  • the device includes a processor 703, a memory 705, an input unit 707, a display unit 709, and other components.
  • the memory 705 may be used to store an application program 701 and various functional modules, and the processor 703 runs the application program 701 stored in the memory 705 to execute various functional applications and data processing of the device.
  • the memory may be internal memory or external memory, or include both internal memory and external memory.
  • the internal memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory.
  • External storage can include hard disks, floppy disks, ZIP disks, U disks, tapes, etc.
  • the memory disclosed in this application includes but is not limited to these types of memory.
  • the memory disclosed in this application is only an example and not a limitation.
  • the input unit 707 is used to receive input of signals and keywords input by the user.
  • the input unit 707 may include a touch panel and other input devices.
  • the touch panel can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connected device according to a preset program; other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control buttons, switch buttons, etc.), trackball, mouse, and joystick.
  • the display unit 709 can be used to display information input by the user or information provided to the user and various menus of the computer device.
  • the display unit 709 can take the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the processor 703 is the control center of the computer device. It uses various interfaces and lines to connect the parts of the entire computer, and performs various functions and processes data by running or executing the software programs and/or modules stored in the memory 705 and calling the data stored in the memory.
  • the device includes one or more processors 703, one or more memories 705, and one or more application programs 701.
  • the one or more application programs 701 are stored in the memory 705 and are configured to be executed by the one or more processors 703, and the one or more application programs 701 are configured to execute the method for identifying an Internet access device described in the foregoing embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer readable storage medium.

Landscapes

  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Pure & Applied Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)
  • Storage Device Security (AREA)

Abstract

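This application provides a method, apparatus, storage medium, and computer device for identifying an Internet access device. The method includes: acquiring browser features of the Internet access device; acquiring hash values of the browser features and their corresponding weights; performing a simhash calculation according to the hash values and corresponding weights to obtain a browser feature value string corresponding to the browser features; and comparing the browser feature value string with system historical data to identify the Internet access device, wherein the system historical data includes value strings corresponding to the browser features of multiple devices. The method uses the browser features of the Internet access device and a simhash calculation to identify the device accurately, enabling accurate recording and tracking of the device.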

Description

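Method, Apparatus, Storage Medium, and Computer Device for Identifying an Internet Access Device
This application claims priority to Chinese patent application No. 201910113122.4, filed with the China Patent Office on February 13, 2019 and entitled "Method, apparatus, storage medium, and computer device for identifying an Internet access device" (上网设备的识别方法、装置及存储介质、计算机设备), the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of Internet access device research, and in particular to a method, apparatus, storage medium, and computer device for identifying an Internet access device.
Background
Browser device fingerprints currently on the market are mainly of two kinds. One plants a UUID in a cookie. The other draws a graphic with Canvas and takes its CRC checksum (which varies with the image processing engine, export options, and compression level), then assigns the user's device a specific number (the fingerprint) that is used to distinguish devices.
Drawback of cookies: as the Internet places increasing importance on personal privacy, cookies are becoming less and less welcome. Many security tools and even browsers now allow or guide users to turn off cookies, so cookie-based identification has little effect.
Drawback of Canvas: its accuracy is limited. Different devices of the same model running the same browser may produce the same Canvas output, resulting in duplicate device fingerprints.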
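Summary
This application proposes a method, apparatus, storage medium, and computer device for identifying an Internet access device, so as to accurately identify the Internet access device on the user side and thereby record and track it accurately.
This application provides the following solutions:
A method for identifying an Internet access device, including: acquiring browser features of the Internet access device; acquiring hash values of the browser features and their corresponding weights; performing a simhash calculation according to the hash values and corresponding weights to obtain a browser feature value string corresponding to the browser features; and comparing the browser feature value string with system historical data to identify the Internet access device, wherein the system historical data includes value strings corresponding to the browser features of multiple devices.
An apparatus for identifying an Internet access device, including: a first acquisition module for acquiring browser features of the Internet access device; a second acquisition module for acquiring hash values of the browser features and their corresponding weights; a calculation module for performing a simhash calculation according to the hash values and corresponding weights to obtain a browser feature value string corresponding to the browser features; and an identification module for comparing the browser feature value string with system historical data to identify the Internet access device, wherein the system historical data includes value strings corresponding to the browser features of multiple devices.
A storage medium on which a computer program is stored, the computer program being adapted to be loaded by a processor to execute the method for identifying an Internet access device described in any of the above embodiments.
A computer device, including: one or more processors; a memory; and one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to execute the method for identifying an Internet access device described in any of the above embodiments.
The method provided by the above embodiments obtains the weights corresponding to the browser features of the Internet access device and the hash values of those features, and performs a simhash calculation based on the two, so that the browser features are de-duplicated by the simhash algorithm and the corresponding browser feature value string is obtained. Finally, the Internet access device is identified according to the result of comparing the browser feature value string with the system historical data. The method uses the browser features of the Internet access device and a simhash calculation to identify the device accurately, enabling accurate recording and tracking of the device.
Additional aspects and advantages of this application will be given in part in the following description; they will become apparent from the description or be learned through practice of this application.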
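Brief Description of the Drawings
The above and/or additional aspects and advantages of this application will become apparent and easy to understand from the following description of the embodiments in conjunction with the drawings, in which:
FIG. 1 is a schematic diagram of the interaction between a server and an Internet access device in an embodiment provided by this application;
FIG. 2 is a method flowchart of an embodiment of a method for identifying an Internet access device provided by this application;
FIG. 3 is a method flowchart of another embodiment of a method for identifying an Internet access device provided by this application;
FIG. 4 is a schematic flowchart of a specific implementation of a method for identifying an Internet access device provided by this application;
FIG. 5 is a method flowchart of an embodiment of step S400 provided by this application;
FIG. 6 is a structural block diagram of an embodiment of an apparatus for identifying an Internet access device provided by this application;
FIG. 7 is a schematic structural diagram of an embodiment of a computer device provided by this application.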
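Detailed Description
The embodiments of this application are described in detail below, and examples of the embodiments are shown in the drawings.
The method for identifying an Internet access device provided by this application is applicable to any server system that provides a browser to the front end and identifies the attributes of the Internet access device on the user side through the browser. In one embodiment, the method is applied in the application environment shown in FIG. 1.
As shown in FIG. 1, the server 100 and the Internet access device 300 on the user side are located in the same network 200 environment and exchange data through the network 200. In this embodiment, the Internet access device 300 on the user side communicates with the server 100 over the network. The number of servers 100 and user terminals 300 is not limited; FIG. 1 is only an illustration. A browser client is installed in the Internet access device 300 on the user side and is used for network access to the server 100. The user can interact with the corresponding server 100 through the browser client in the Internet access device 300. The browser client corresponds to the server side, and the two follow the same data protocol, so that the server side and the browser client can parse each other's data.
The server 100 may be, but is not limited to, a web server, a management server, an application server, a database server, a cloud server, and so on. The Internet access device 300 on the user side may be, but is not limited to, a smart phone, a personal computer (PC), a tablet computer, a personal digital assistant (PDA), a mobile Internet device (MID), etc. The operating system of the Internet access device 300 on the user side may be, but is not limited to, an Android system, an IOS (IPhone operating system) system, a Windows phone system, a Windows system, etc.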
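This application provides a method for identifying an Internet access device. In one embodiment, as shown in FIG. 2, the method includes the following steps:
S100: Acquire browser features of the Internet access device.
In this embodiment, the server obtains multiple browser features of the Internet access device. Browser features can include the available screen resolution of the device, the actual screen resolution of the device, the number of touch points the device supports, the audio stack fingerprint, the total number of logical processors the system makes available to the user agent, browser plug-ins, fonts installed in the browser, Canvas, WebGL, etc. Specifically, the server accesses the browser API through JavaScript to obtain the browser features. The features include ones that reflect the browser itself as well as ones that reflect the software and hardware of the operating system. In a specific implementation, at least 50 browser features of the Internet access device can be obtained. Combining multiple browser features raises the entropy of the feature set and avoids the low entropy of any single feature, so that an accurate, unique device fingerprint can be determined. In one implementation, the browser features of the Internet access device include one or more of: the available screen resolution of the device, the actual screen resolution of the device, the number of touch points the device supports, the audio stack fingerprint, the total number of logical processors the system makes available to the user agent, browser plug-ins, fonts installed in the browser, Canvas, and WebGL.
Specifically, the actual screen resolution of the device refers to the width and height of the device screen in pixels, such as 1920*1080. The available screen resolution refers to the screen width and height in pixels that an application can use; for example, if the Windows taskbar is 40px high, the available screen resolution is 1920*(1080-40)=1920*1040. The audio stack fingerprint is a string composed of 7 parts, such as 48000_2_1_0_2_explicit_speakers, where: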
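First part: the sampling frequency.
Second part: the maximum number of channels.
Third part: the number of input bytes.
Fourth part: the number of output bytes.
Fifth part: the number of channels.
Sixth part: how ETS computes internal values when connections to the audio input are up-mixed or down-mixed.
Seventh part: an enumeration value describing the meaning of the channels; it defines how audio up-mixing and down-mixing occur.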
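The seven-part layout above can be illustrated with a small parser. This is a minimal sketch, not the applicant's implementation: the class name, field names, and the choice to treat the first five parts as integers are assumptions made for illustration, based only on the example string 48000_2_1_0_2_explicit_speakers.

```python
from typing import NamedTuple


class AudioStackFingerprint(NamedTuple):
    """Hypothetical container; field names follow the seven parts listed above."""
    sampling_frequency: int      # part 1
    max_channels: int            # part 2
    input_value: int             # part 3 (described in the text as input bytes)
    output_value: int            # part 4 (described in the text as output bytes)
    channels: int                # part 5
    mixing_mode: str             # part 6
    channel_interpretation: str  # part 7


def parse_audio_fingerprint(raw: str) -> AudioStackFingerprint:
    """Split an underscore-separated fingerprint into its seven parts."""
    parts = raw.split("_")
    if len(parts) != 7:
        raise ValueError(f"expected 7 parts, got {len(parts)}: {raw!r}")
    numbers = [int(p) for p in parts[:5]]  # assumed numeric, as in the example
    return AudioStackFingerprint(*numbers, parts[5], parts[6])


fingerprint = parse_audio_fingerprint("48000_2_1_0_2_explicit_speakers")
```

In the identification flow the fingerprint is hashed as a whole string; splitting it is mainly useful for validating that a collected value is well formed.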
S200,获取所述浏览器特征的哈希值及对应的权重。
在本实施例中,服务器将获取到的浏览器特征进行哈希处理,得到对应的哈希值。同时,确定每个浏览器特征的权重。其中,浏览器特征的权 重根据数据统计确定,如浏览器特征的稳定性(缺失值多少)和浏览器特征的熵大小等。
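In one embodiment, the browser features of the Internet access device include device features of the Internet access device obtained through the browser, and client features of the browser. In this case, step S100 includes: sending an SDK package to the Internet access device through the browser and using the SDK package to obtain the device features of the Internet access device; and obtaining the client features by reading the client data of the browser. Step S200 includes: obtaining the hash values and weights of the device features; obtaining the hash values and weights of the client features; and using the hash values and weights of the device features together with the hash values and weights of the client features as the hash values and corresponding weights of the browser features.
In this embodiment, the device features obtained through the browser include the available screen resolution, the actual screen resolution, the number of touch points supported by the device, and so on. The client features of the browser include the audio stack fingerprint, the total number of logical processors available to the user agent, browser plug-ins, installed browser fonts, Canvas, WebGL, and so on. The system may send an SDK package to the Internet access device through the browser and read the device features through that package; the SDK package parses the device features of the Internet access device and sends them to the system. In addition, the system obtains the client feature data by reading the relevant data of the front-end browser. The hash values and weights of the device features and of the client features are then used together as the hash values and corresponding weights of the browser features in the calculation of step S300.
In one implementation, obtaining the weight corresponding to each browser feature in step S200 includes: obtaining the weight of each browser feature according to the missing values of that feature; or obtaining the weight of each browser feature according to the entropy of that feature.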
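The application does not give a formula for turning missing-value statistics or entropy into a weight, so the following is only a sketch under stated assumptions. The functions `missing_rate`, `shannon_entropy`, `stability_weight` and `entropy_weight`, the cap `max_weight`, and the rounding rules are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from typing import Optional, Sequence


def missing_rate(values: Sequence[Optional[str]]) -> float:
    """Fraction of historical observations in which the feature was missing."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def shannon_entropy(values: Sequence[Optional[str]]) -> float:
    """Shannon entropy (in bits) of the observed, non-missing feature values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return 0.0
    total = len(observed)
    counts = Counter(observed)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def stability_weight(values: Sequence[Optional[str]], max_weight: int = 4) -> int:
    """Assumed mapping: fewer missing values -> larger integer weight."""
    return max(1, round(max_weight * (1.0 - missing_rate(values))))


def entropy_weight(values: Sequence[Optional[str]], max_weight: int = 4) -> int:
    """Assumed mapping: higher entropy -> larger integer weight, capped."""
    return max(1, min(max_weight, round(shannon_entropy(values))))


history = ["1920*1040", None, "1366*728", None]
print(missing_rate(history), stability_weight(history))
print(shannon_entropy(["a", "b", "a", "b"]))
```

Either mapping yields small positive integers, which is convenient for the weighted accumulation in simhash; a real system would calibrate the mapping on its own historical data.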
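S300: Perform a simhash calculation according to the hash values and corresponding weights to obtain the browser feature value string corresponding to the browser features.
In this embodiment, the server performs a simhash calculation according to the hash values of the browser features and their corresponding weights to obtain the browser feature value string. Note that the server converts the browser feature fields of historical devices into numeric form and stores the mapping in the corresponding system dictionary; the same numeric conversion is used in this step to obtain the browser feature value string. The simhash algorithm is one kind of LSH (Locality-Sensitive Hashing) algorithm. It reduces dimensionality by mapping a high-dimensional feature vector to a low-dimensional one, and the similarity of two vectors is then determined from the Hamming distance between their reduced forms.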
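In one embodiment, there are multiple browser features. As shown in FIG. 3, obtaining the hash values and corresponding weights of the browser features and performing the simhash calculation to obtain the browser feature value string includes:
S201: Hash each browser feature to obtain its hash value.
S203: Obtain the weight corresponding to each browser feature.
S301: Weight each hash value by the weight of its browser feature to obtain a weighted value.
S303: Merge and accumulate the weighted values of the multiple browser features to obtain an accumulated value.
S305: Reduce the dimensionality of the accumulated value to obtain the browser feature value string.
In this embodiment, each browser feature influences the identification of the Internet access device to a different degree, and the server sets the weights accordingly. The hash value of each browser feature is weighted by that feature's weight, the weighted values are accumulated, and the accumulated value is reduced in dimensionality to obtain the browser feature value string. In a specific implementation, taking four features as an example (screen resolution, number of touch points, audio stack fingerprint, and WebGL) with weights (1, 2, 3, 4) respectively, the calculation logic is shown in FIG. 4.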
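Steps S201 to S305 can be sketched as a conventional weighted simhash. This is a minimal sketch, not the applicant's implementation: the 64-bit width, the use of MD5 as the per-feature hash, the `name=value` encoding, and the names `feature_hash` and `simhash` are assumptions, and the feature values below are made up for illustration.

```python
import hashlib
from typing import Mapping, Tuple

BITS = 64  # assumed fingerprint width; the application does not fix one


def feature_hash(text: str, bits: int = BITS) -> int:
    """S201: hash one feature to a bits-wide integer (MD5 is an assumption)."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (128 - bits)


def simhash(features: Mapping[str, Tuple[str, int]], bits: int = BITS) -> int:
    """Weighted simhash over {name: (value, weight)}."""
    totals = [0] * bits
    for name, (value, weight) in features.items():
        h = feature_hash(f"{name}={value}", bits)
        for i in range(bits):
            # S301/S303: a 1 bit contributes +weight, a 0 bit contributes -weight.
            totals[i] += weight if (h >> i) & 1 else -weight
    fingerprint = 0
    for i, total in enumerate(totals):
        # S305: reduce each accumulated column to a single bit.
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


device = {
    "screen_resolution": ("1920*1080", 1),
    "touch_points": ("0", 2),
    "audio_stack": ("48000_2_1_0_2_explicit_speakers", 3),
    "webgl": ("example-renderer-string", 4),
}
print(format(simhash(device), "064b"))
```

Because heavier features contribute more to each bit column, a change in a low-weight feature tends to flip fewer bits of the result than a change in a high-weight one, which is what makes the later Hamming-distance comparison meaningful.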
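S400: Compare the browser feature value string with the system history data to identify the Internet access device, where the system history data includes the value strings corresponding to the browser features of multiple devices.
In this embodiment, the server compares the browser feature value string with the system history data. After obtaining the browser feature value string, the server retrieves the browser feature value strings it holds for devices of the corresponding type. If the dictionary in which the server stores value strings contains no browser feature value string for that type of device, the obtained string is stored in the dictionary directly and the device of that type is marked. Otherwise, the obtained string is compared pairwise with the set of value strings already in the server dictionary to identify the Internet access device.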
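In one embodiment, as shown in FIG. 5, step S400 includes the following steps:
S410: Determine whether the system history data contains data corresponding to the browser features.
S420: If so, compare the browser feature value string pairwise with the set of value strings in the system history data and calculate the Hamming distance between each pair; if the Hamming distance is within the threshold range, determine that the Internet access device and the device corresponding to that value string in the system are the same device, and assign both the same device fingerprint.
S430: If not, set a device fingerprint for the Internet access device and associate that device fingerprint with the browser feature value string.
In this embodiment, when the server obtains the browser feature value string, it first determines whether a value string corresponding to the browser features already exists in the system. If so, the string is compared pairwise with the set of existing value strings, and the Hamming distance between the two strings is calculated (that is, the number of 1 bits in the result of XORing them). If the distance is within the threshold range, the current Internet access device and the device corresponding to the stored browser feature value string are considered the same device, and the system uses the same device fingerprint ID. Otherwise, the browser feature value string is stored and associated with a device fingerprint. With this logic, a new Internet access device receives a new device fingerprint and the same device receives the same device fingerprint, which makes it possible to record and track the target Internet access device.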
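The comparison in S410 to S430 can be sketched as follows. The dictionary layout, the `low`/`high` bounds, the use of `uuid4` for new fingerprint IDs, and the function names are assumptions for illustration; this sketch also returns the first stored string inside the range, whereas a real system might prefer the nearest one.

```python
import uuid
from typing import Dict


def hamming_distance(a: int, b: int) -> int:
    """Number of 1 bits in a XOR b."""
    return bin(a ^ b).count("1")


def identify(value: int, history: Dict[int, str], low: int, high: int) -> str:
    """Return the device fingerprint ID for a browser feature value string.

    history maps stored value strings (as integers) to fingerprint IDs.
    """
    # S420: pairwise comparison against every stored value string.
    for stored, fingerprint_id in history.items():
        if low <= hamming_distance(value, stored) <= high:
            return fingerprint_id
    # S430: no match, so register the device under a new fingerprint ID.
    new_id = uuid.uuid4().hex
    history[value] = new_id
    return new_id


history: Dict[int, str] = {}
first = identify(0b1111, history, low=0, high=1)
again = identify(0b1110, history, low=0, high=1)   # 1 bit away from 0b1111
other = identify(0b0000, history, low=0, high=1)   # 4 bits away from 0b1111
print(first == again, first == other)
```

A linear scan is adequate for a sketch; with a large history, an index over bit blocks of the fingerprint is a common way to avoid comparing against every stored string.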
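Further, the threshold range is determined as follows:
Obtain, from the system history data, multiple Internet access devices that are labeled with value strings related to browser features; for each of these devices, obtain the hash values of its browser features and the weight corresponding to each browser feature; perform a simhash calculation on the hash values and weights of each device to obtain the browser feature value string for that device; and calculate the Hamming distance between each device's browser feature value string and the corresponding value string labeled in the system, then determine the threshold range from the results.
Specifically, to determine the threshold range, the system may obtain multiple Internet access devices that are labeled in the system with value strings related to browser features. A simhash calculation is performed on the hash values and weights of the browser features of each device to obtain each browser feature value string. The Hamming distance between each browser feature value string and the corresponding value string stored in the system is then calculated, giving multiple result values. The range spanned by these result values is taken as the threshold range.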
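In one embodiment, determining the threshold range from the calculation results includes: obtaining the Hamming distance between each device's browser feature value string and the corresponding value string labeled in the system, giving multiple Hamming distance values; selecting the maximum and minimum of these values; and taking the range of natural numbers between the minimum and the maximum as the threshold range.
Specifically, the Hamming distance values obtained from the labeled Internet access devices in the system are used as training sample values, and the maximum and minimum Hamming distance among the training samples are obtained. The threshold range is the range of natural numbers between that minimum and that maximum.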
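The min/max rule for the threshold range can be written in a few lines. This is a sketch under the assumption that value strings are held as integers and that each labeled device contributes one (recomputed, stored) pair; the function name `threshold_range` is hypothetical.

```python
from typing import Iterable, Tuple


def threshold_range(pairs: Iterable[Tuple[int, int]]) -> range:
    """Natural numbers from the smallest to the largest observed Hamming distance.

    pairs holds (recomputed value string, labeled value string) per device.
    """
    distances = [bin(a ^ b).count("1") for a, b in pairs]
    if not distances:
        raise ValueError("at least one labeled device is required")
    return range(min(distances), max(distances) + 1)


samples = [(0b1111, 0b1110), (0b1111, 0b1100), (0b1010, 0b1010)]
bounds = threshold_range(samples)
print(list(bounds))
print(2 in bounds, 3 in bounds)
```

Using the raw minimum and maximum makes the range sensitive to a single outlier among the labeled devices; trimming extreme samples is one possible refinement, though the application does not describe it.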
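The method provided by the above embodiments obtains the weights and hash values of the browser features of an Internet access device and performs a simhash calculation on them, using the simhash algorithm to deduplicate the browser features and obtain the corresponding browser feature value string. The Internet access device is finally identified from the result of comparing that value string with the system history data. By applying the simhash calculation to the browser features of the device, the method identifies the Internet access device accurately, which allows the device to be recorded and tracked accurately. Different Internet access devices receive different device browser fingerprints from the algorithm, and the same Internet access device receives the same device browser fingerprint, so users can be tracked and identified without being aware of it.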
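This application also provides an apparatus for identifying an Internet access device. In one embodiment, as shown in FIG. 6, the apparatus includes a first obtaining module 10, a second obtaining module 20, a calculation module 30, and an identification module 40.
The first obtaining module 10 is configured to obtain the browser features of the Internet access device. In this embodiment, the server obtains multiple browser features of the Internet access device. The browser features may include the available screen resolution of the device, the actual screen resolution of the device, the number of touch points supported by the device, the audio stack fingerprint, the total number of logical processors the system makes available to the user agent, browser plug-ins, fonts installed in the browser, Canvas, WebGL, and so on. Specifically, the server obtains the browser features by using JavaScript to call browser APIs. Some of these features reflect characteristics of the browser itself, and others reflect software and hardware characteristics of the operating system. In a specific implementation, at least 50 browser features of the Internet access device may be obtained. Combining multiple browser features raises the entropy of the combined feature set and avoids relying on any single low-entropy feature, so that an accurate and unique device fingerprint can be determined.
The second obtaining module 20 is configured to obtain the hash value of each browser feature and its corresponding weight. In this embodiment, the server hashes each obtained browser feature to get the corresponding hash value and, at the same time, determines the weight of each browser feature. The weight of a browser feature is determined from data statistics, such as the stability of the feature (how often its value is missing) and the entropy of the feature.
The calculation module 30 is configured to perform a simhash calculation according to the hash values and corresponding weights to obtain the browser feature value string corresponding to the browser features. In this embodiment, the server performs the simhash calculation according to the hash values of the browser features and their corresponding weights. Note that the server converts the browser feature fields of historical devices into numeric form and stores the mapping in the corresponding system dictionary; this module uses the same numeric conversion to obtain the browser feature value string.
The identification module 40 is configured to compare the browser feature value string with the system history data to identify the Internet access device, where the system history data includes the value strings corresponding to the browser features of multiple devices. In this embodiment, after obtaining the browser feature value string, the server retrieves the browser feature value strings it holds for devices of the corresponding type. If the dictionary in which the server stores value strings contains no browser feature value string for that type of device, the obtained string is stored in the dictionary directly and the device of that type is marked. Otherwise, the obtained string is compared pairwise with the set of value strings already in the server dictionary to identify the Internet access device.
In other embodiments, the modules of the apparatus provided by this application are further configured to perform the operations of the corresponding steps of the identification method described in this application, which are not described again in detail here.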
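This application also provides a storage medium. A computer program is stored on the storage medium; when the computer program is executed by a processor, the method for identifying an Internet access device described in any of the above embodiments is implemented. The storage medium may be a memory, for example an internal memory, an external memory, or both. The internal memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory. The external memory may include a hard disk, a floppy disk, a ZIP disk, a USB flash drive, magnetic tape, and the like. The storage media disclosed in this application include, but are not limited to, these types of memory, which are given as examples and not as limitations.
This application also provides a computer device. The computer device includes: one or more processors; a memory; and one or more application programs. The one or more application programs are stored in the memory and configured to be executed by the one or more processors, and are configured to perform the method for identifying an Internet access device described in any of the above embodiments.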
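FIG. 7 is a schematic structural diagram of a computer device in an embodiment of this application. The computer device in this embodiment may be a server, a personal computer, or a network device. As shown in FIG. 7, the device includes a processor 703, a memory 705, an input unit 707, a display unit 709, and other components. Those skilled in the art will understand that the structure shown in FIG. 7 does not limit all devices, which may include more or fewer components than shown or combine certain components. The memory 705 may be used to store the application program 701 and the functional modules, and the processor 703 runs the application program 701 stored in the memory 705 to perform the various functional applications and data processing of the device. The memory may be an internal memory, an external memory, or both. The internal memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory. The external memory may include a hard disk, a floppy disk, a ZIP disk, a USB flash drive, magnetic tape, and the like. The memories disclosed in this application include, but are not limited to, these types, which are given as examples and not as limitations.
The input unit 707 is used to receive signal input and keywords entered by the user. The input unit 707 may include a touch panel and other input devices. The touch panel can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connected apparatus according to a preset program. The other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control keys and power keys), a trackball, a mouse, and a joystick. The display unit 709 may be used to display information entered by the user or provided to the user, as well as the various menus of the computer device, and may take the form of a liquid crystal display, an organic light-emitting diode display, or the like. The processor 703 is the control center of the computer device. It connects the parts of the whole computer through various interfaces and lines, and performs the various functions and processes data by running or executing the software programs and/or modules stored in the memory 705 and calling the data stored in the memory.
In one implementation, the device includes one or more processors 703, one or more memories 705, and one or more application programs 701. The one or more application programs 701 are stored in the memory 705 and configured to be executed by the one or more processors 703, and are configured to perform the method for identifying an Internet access device described in the above embodiments.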
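In addition, the functional units in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be completed by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may include a memory, a magnetic disk, an optical disc, or the like.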
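The above are only some of the implementations of this application. It should be noted that those of ordinary skill in the art can make improvements and refinements without departing from the principles of this application, and such improvements and refinements should also be regarded as falling within the scope of protection of this application.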

Claims (20)

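  1. A method for identifying an Internet access device, comprising:
    obtaining browser features of the Internet access device;
    obtaining hash values of the browser features and corresponding weights;
    performing a simhash calculation according to the hash values and the corresponding weights to obtain a browser feature value string corresponding to the browser features;
    comparing the browser feature value string with system history data to identify the Internet access device, wherein the system history data includes value strings corresponding to the browser features of multiple devices.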
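  2. The method according to claim 1, wherein the browser features of the Internet access device include device features of the Internet access device obtained through a browser and client features of the browser;
    the obtaining browser features of the Internet access device comprises:
    sending an SDK package to the Internet access device through the browser, and using the SDK package to obtain the device features of the Internet access device;
    obtaining the client features by reading client data of the browser;
    the obtaining hash values of the browser features and corresponding weights comprises:
    obtaining the hash values of the device features and the weights of the device features;
    obtaining the hash values of the client features and the weights of the client features;
    using the hash values and weights of the device features, together with the hash values and weights of the client features, as the hash values of the browser features and the corresponding weights.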
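  3. The method according to claim 1, wherein the obtaining the weights corresponding to the browser features comprises:
    obtaining the weight of each browser feature according to the missing values of that browser feature; or
    obtaining the weight of each browser feature according to the entropy of that browser feature.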
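  4. The method according to claim 1, wherein there are multiple browser features; the obtaining hash values of the browser features and corresponding weights, and performing a simhash calculation according to the hash values and the corresponding weights to obtain the browser feature value string corresponding to the browser features, comprises:
    hashing each browser feature to obtain the hash value of each browser feature;
    obtaining the weight corresponding to each browser feature;
    weighting the corresponding hash value according to the weight of each browser feature to obtain a weighted value;
    merging and accumulating the weighted values corresponding to the multiple browser features to obtain an accumulated value;
    reducing the dimensionality of the accumulated value to obtain the browser feature value string.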
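  5. The method according to claim 1, wherein the comparing the browser feature value string with system history data to identify the Internet access device comprises:
    determining whether the system history data contains data corresponding to the browser features;
    if so, comparing the browser feature value string pairwise with the set of value strings in the system history data and calculating the Hamming distance between the two; if the Hamming distance is within a threshold range, determining that the Internet access device and the device corresponding to that value string in the system are the same device, and assigning both the same device fingerprint;
    if not, setting a device fingerprint for the Internet access device and associating the device fingerprint with the browser feature value string.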
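  6. The method according to claim 5, wherein the threshold range is determined as follows:
    obtaining, from the system history data, multiple Internet access devices labeled with value strings related to browser features;
    obtaining, for each of the multiple Internet access devices, the hash values of its browser features and the weight corresponding to each browser feature;
    performing a simhash calculation according to the hash values and weights corresponding to each Internet access device to obtain the browser feature value string corresponding to each Internet access device;
    calculating the Hamming distance between the browser feature value string corresponding to each Internet access device and the corresponding value string labeled in the system, and determining the threshold range according to the calculation results.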
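  7. The method according to claim 6, wherein the determining the threshold range according to the calculation results comprises:
    obtaining the Hamming distance between the browser feature value string corresponding to each Internet access device and the corresponding value string labeled in the system, to obtain multiple Hamming distance values;
    selecting the maximum value and the minimum value from the multiple Hamming distance values;
    taking the range of natural numbers between the minimum value and the maximum value as the threshold range.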
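  8. An apparatus for identifying an Internet access device, comprising:
    a first obtaining module, configured to obtain browser features of the Internet access device;
    a second obtaining module, configured to obtain hash values of the browser features and corresponding weights;
    a calculation module, configured to perform a simhash calculation according to the hash values and the corresponding weights to obtain a browser feature value string corresponding to the browser features;
    an identification module, configured to compare the browser feature value string with system history data to identify the Internet access device, wherein the system history data includes value strings corresponding to the browser features of multiple devices.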
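  9. The apparatus according to claim 8, wherein the browser features of the Internet access device include device features of the Internet access device obtained through a browser and client features of the browser;
    the first obtaining module is specifically configured to:
    send an SDK package to the Internet access device through the browser, and use the SDK package to obtain the device features of the Internet access device;
    obtain the client features by reading client data of the browser;
    the second obtaining module is specifically configured to:
    obtain the hash values of the device features and the weights of the device features;
    obtain the hash values of the client features and the weights of the client features;
    use the hash values and weights of the device features, together with the hash values and weights of the client features, as the hash values of the browser features and the corresponding weights.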
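  10. The apparatus according to claim 8, wherein, when obtaining the weights corresponding to the browser features, the second obtaining module is specifically configured to:
    obtain the weight of each browser feature according to the missing values of that browser feature; or
    obtain the weight of each browser feature according to the entropy of that browser feature.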
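  11. The apparatus according to claim 8, wherein there are multiple browser features; the calculation module is specifically configured to:
    hash each browser feature to obtain the hash value of each browser feature;
    obtain the weight corresponding to each browser feature;
    weight the corresponding hash value according to the weight of each browser feature to obtain a weighted value;
    merge and accumulate the weighted values corresponding to the multiple browser features to obtain an accumulated value;
    reduce the dimensionality of the accumulated value to obtain the browser feature value string.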
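  12. The apparatus according to claim 8, wherein the identification module is specifically configured to:
    determine whether the system history data contains data corresponding to the browser features;
    if so, compare the browser feature value string pairwise with the set of value strings in the system history data and calculate the Hamming distance between the two; if the Hamming distance is within a threshold range, determine that the Internet access device and the device corresponding to that value string in the system are the same device, and assign both the same device fingerprint;
    if not, set a device fingerprint for the Internet access device and associate the device fingerprint with the browser feature value string.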
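  13. The apparatus according to claim 12, wherein the threshold range is determined as follows:
    obtaining, from the system history data, multiple Internet access devices labeled with value strings related to browser features;
    obtaining, for each of the multiple Internet access devices, the hash values of its browser features and the weight corresponding to each browser feature;
    performing a simhash calculation according to the hash values and weights corresponding to each Internet access device to obtain the browser feature value string corresponding to each Internet access device;
    calculating the Hamming distance between the browser feature value string corresponding to each Internet access device and the corresponding value string labeled in the system, and determining the threshold range according to the calculation results.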
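  14. The apparatus according to claim 13, wherein, when determining the threshold range according to the calculation results, the identification module is specifically configured to:
    obtain the Hamming distance between the browser feature value string corresponding to each Internet access device and the corresponding value string labeled in the system, to obtain multiple Hamming distance values;
    select the maximum value and the minimum value from the multiple Hamming distance values;
    take the range of natural numbers between the minimum value and the maximum value as the threshold range.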
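  15. A non-volatile computer-readable storage medium, on which a computer program is stored, the computer program being adapted to be loaded by a processor to execute the method for identifying an Internet access device according to any one of claims 1 to 7.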
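  16. A computer device, comprising:
    one or more processors;
    a memory;
    one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to perform the following steps:
    obtaining browser features of an Internet access device;
    obtaining hash values of the browser features and corresponding weights;
    performing a simhash calculation according to the hash values and the corresponding weights to obtain a browser feature value string corresponding to the browser features;
    comparing the browser feature value string with system history data to identify the Internet access device, wherein the system history data includes value strings corresponding to the browser features of multiple devices.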
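  17. The computer device according to claim 16, wherein the browser features of the Internet access device include device features of the Internet access device obtained through a browser and client features of the browser;
    when obtaining the browser features of the Internet access device, the one or more application programs are configured to perform the following steps:
    sending an SDK package to the Internet access device through the browser, and using the SDK package to obtain the device features of the Internet access device;
    obtaining the client features by reading client data of the browser;
    when obtaining the hash values of the browser features and the corresponding weights, the one or more application programs are configured to perform the following steps:
    obtaining the hash values of the device features and the weights of the device features;
    obtaining the hash values of the client features and the weights of the client features;
    using the hash values and weights of the device features, together with the hash values and weights of the client features, as the hash values of the browser features and the corresponding weights.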
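  18. The computer device according to claim 16, wherein, when obtaining the weights corresponding to the browser features, the one or more application programs are configured to perform the following steps:
    obtaining the weight of each browser feature according to the missing values of that browser feature; or
    obtaining the weight of each browser feature according to the entropy of that browser feature.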
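  19. The computer device according to claim 16, wherein there are multiple browser features; when obtaining the hash values of the browser features and the corresponding weights and performing the simhash calculation according to the hash values and the corresponding weights to obtain the browser feature value string corresponding to the browser features, the one or more application programs are configured to perform the following steps:
    hashing each browser feature to obtain the hash value of each browser feature;
    obtaining the weight corresponding to each browser feature;
    weighting the corresponding hash value according to the weight of each browser feature to obtain a weighted value;
    merging and accumulating the weighted values corresponding to the multiple browser features to obtain an accumulated value;
    reducing the dimensionality of the accumulated value to obtain the browser feature value string.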
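  20. The computer device according to claim 16, wherein, when comparing the browser feature value string with the system history data to identify the Internet access device, the one or more application programs are configured to perform the following steps:
    determining whether the system history data contains data corresponding to the browser features;
    if so, comparing the browser feature value string pairwise with the set of value strings in the system history data and calculating the Hamming distance between the two; if the Hamming distance is within a threshold range, determining that the Internet access device and the device corresponding to that value string in the system are the same device, and assigning both the same device fingerprint;
    if not, setting a device fingerprint for the Internet access device and associating the device fingerprint with the browser feature value string.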
PCT/CN2019/117866 2019-02-13 2019-11-13 上网设备的识别方法、装置及存储介质、计算机设备 WO2020164272A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910113122.4 2019-02-13
CN201910113122.4A CN109995576A (zh) 2019-02-13 2019-02-13 上网设备的识别方法、装置及存储介质、计算机设备

Publications (1)

Publication Number Publication Date
WO2020164272A1 true WO2020164272A1 (zh) 2020-08-20

Family

ID=67129297

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117866 WO2020164272A1 (zh) 2019-02-13 2019-11-13 上网设备的识别方法、装置及存储介质、计算机设备

Country Status (2)

Country Link
CN (1) CN109995576A (zh)
WO (1) WO2020164272A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117056912A (zh) * 2023-08-15 2023-11-14 浙江齐安信息科技有限公司 基于canvas指纹的操作系统识别方法、设备及介质

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109995576A (zh) * 2019-02-13 2019-07-09 平安科技(深圳)有限公司 上网设备的识别方法、装置及存储介质、计算机设备
CN111181912B (zh) * 2019-08-27 2021-10-15 腾讯科技(深圳)有限公司 浏览器标识的处理方法、装置、电子设备及存储介质
CN111400695B (zh) * 2020-04-09 2024-05-10 中国建设银行股份有限公司 一种设备指纹生成方法、装置、设备和介质
CN112765578B (zh) * 2021-01-26 2022-09-16 上海黔易数据科技有限公司 一种基于浏览器客户端的安全隐私计算的实现方法
CN113177144A (zh) * 2021-04-28 2021-07-27 Oppo广东移动通信有限公司 用户识别方法、用户识别装置、电子设备及介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9298757B1 (en) * 2013-03-13 2016-03-29 International Business Machines Corporation Determining similarity of linguistic objects
CN106407203A (zh) * 2015-07-29 2017-02-15 阿里巴巴集团控股有限公司 一种对目标终端进行识别的方法和设备
CN106453437A (zh) * 2016-12-22 2017-02-22 中国银联股份有限公司 一种设备识别码获取方法及装置
CN106599227A (zh) * 2016-12-19 2017-04-26 北京天广汇通科技有限公司 用于获取基于属性值的对象之间的相似度的方法与装置
CN108363811A (zh) * 2018-03-09 2018-08-03 北京京东金融科技控股有限公司 设备识别方法及装置、电子设备、存储介质
CN108881513A (zh) * 2018-06-29 2018-11-23 深圳鼎盛电脑科技有限公司 一种设备码生成的方法、装置、设备及存储介质
CN109995576A (zh) * 2019-02-13 2019-07-09 平安科技(深圳)有限公司 上网设备的识别方法、装置及存储介质、计算机设备

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8733732B2 (en) * 2010-05-24 2014-05-27 Eaton Corporation Pressurized o-ring pole piece seal for a manifold
CN105100164B (zh) * 2014-05-20 2018-06-15 深圳市腾讯计算机系统有限公司 网络服务推荐方法和装置
CN107770805B (zh) * 2016-08-22 2021-07-27 腾讯科技(深圳)有限公司 终端的标识信息的判定方法及装置
CN106650382A (zh) * 2016-12-30 2017-05-10 北京工业大学 一种基于浏览器的高性能用户追踪方法

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117056912A (zh) * 2023-08-15 2023-11-14 浙江齐安信息科技有限公司 基于canvas指纹的操作系统识别方法、设备及介质
CN117056912B (zh) * 2023-08-15 2024-06-11 浙江齐安信息科技有限公司 基于canvas指纹的操作系统识别方法、设备及介质

Also Published As

Publication number Publication date
CN109995576A (zh) 2019-07-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19915152

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19915152

Country of ref document: EP

Kind code of ref document: A1