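WO2021012522A1 - Web page forensics method, apparatus, storage medium and server based on image recognition - Google Patents

Web page forensics method, apparatus, storage medium and server based on image recognition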

Info

Publication number
WO2021012522A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
element object
target
pixel
dynamic
Prior art date
Application number
PCT/CN2019/118149
Other languages
English (en)
French (fr)
Inventor
陈爽
陈源
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021012522A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links

Definitions

  • This application belongs to the field of computer technology, and in particular relates to a web forensics method and device based on image recognition, a computer non-volatile readable storage medium and a server.
  • The embodiments of the present application provide a web page forensics method and device based on image recognition, a computer non-volatile readable storage medium and a server, to solve the problem that evidence collected from dynamic web content through screenshots or photos has extremely low credibility and is difficult for a court to accept during litigation.
  • the first aspect of the embodiments of the present application provides a web page forensics method based on image recognition, which may include:
  • An evidence image is selected from each frame image of the image sequence, and the evidence image is a frame image whose similarity with the target image is greater than a preset similarity threshold.
  • the second aspect of the embodiments of the present application provides a webpage forensics device, which may include modules that implement the steps of the webpage forensics method.
  • The third aspect of the embodiments of the present application provides a computer non-volatile readable storage medium that stores computer-readable instructions; when the computer-readable instructions are executed by a processor, the steps of the above web page forensics method are implemented.
  • The fourth aspect of the embodiments of the present application provides a server, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the steps of the above web page forensics method are implemented.
  • Compared with the prior art, the embodiments of this application have the following beneficial effect: a server for web page forensics (the forensic server, that is, the implementation subject of this embodiment) is set up in advance. When a user finds content that can be used as evidence in the dynamic content of a web page, the user can capture an image of that evidence from the dynamic content, namely the target image, and then send a web page forensics request to the forensic server through the user's own terminal device. The request includes the uniform resource locator of the web page and the target image.
  • After receiving the web page forensics request, the forensic server can first obtain the web page according to the uniform resource locator, select the dynamic element object from the web page, and collect the image sequence of the dynamic element object. It then calculates the similarity between each frame image of the image sequence and the target image and selects, on that basis, the evidence image that can be used as evidence. Because the evidence image is collected by the forensic server rather than by the user, the credibility of the evidence is greatly improved, so that it can be accepted by the court during litigation.
  • FIG. 1 is a flowchart of an embodiment of a method for web page forensics based on image recognition in an embodiment of the application;
  • Figure 2 is a schematic flowchart of a specific implementation of selecting dynamic element objects from a target webpage
  • Figure 3 is a schematic flow chart of another specific implementation of selecting dynamic element objects from a target webpage
  • FIG. 4 is a structural diagram of an embodiment of a webpage forensics device in an embodiment of the application.
  • Fig. 5 is a schematic block diagram of a server in an embodiment of the application.
  • an embodiment of a method for web page forensics based on image recognition in an embodiment of the present application may include:
  • Step S101 Receive a web page forensics request sent by a terminal device.
  • the webpage forensics request includes the uniform resource locator and the target image of the target webpage.
  • a server for web page forensics is preset, which is hereinafter referred to as a forensic server.
  • the forensic server is the implementation subject of this embodiment and is also the core of the entire forensic system.
  • the forensic server can be set up by the court or other units or organizations authorized by the court.
  • The forensics system can provide users with platform interfaces such as applications (APPs), web pages, and social platform official accounts. After registering on any platform interface through a terminal device such as a mobile phone, tablet or computer, a user can use the web page forensics service provided by the forensics system.
  • After finding content that can be used as evidence in the dynamic content of a web page, the user can take that web page as the target webpage and send a web page forensics request to the forensic server through the user's own terminal device.
  • Specifically, the user can first find the page for submitting web page forensics requests in the platform interface provided by the forensics system and fill in the uniform resource locator (URL) of the target webpage in the designated area of that page. A URL is a concise representation of the location and access method of a resource available on the Internet, and is the address of a standard resource on the Internet.
  • Generally, when a user browses a web page with a browser, the URL of the current web page is displayed in the browser's address bar, and the user can copy the URL directly from the address bar.
  • The user opens the target webpage locally and monitors the changes of its dynamic content.
  • When the user finds evidence content that can be used in litigation, the user takes a screenshot of that content locally to obtain the target image.
  • After filling in the relevant information, the user clicks the submit button to send the forensics request to the forensic server.
  • The forensics request carries the user's identity information, the URL of the target webpage, and the target image.
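The three items carried by the forensics request can be pictured as a small record. The following is only a sketch: the class name `ForensicsRequest`, its field names, and the JSON/base64 encoding are assumptions made for illustration, since the text does not specify a wire format.

```python
import base64
import json
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class ForensicsRequest:
    """Hypothetical request record; names and encoding are illustrative."""

    user_id: str         # identity information of the submitting user
    url: str             # URL of the target webpage
    target_image: bytes  # screenshot taken locally by the user

    def to_json(self) -> str:
        # Reject anything that is not an http(s) address before sending.
        if urlparse(self.url).scheme not in ("http", "https"):
            raise ValueError("expected an http(s) URL")
        return json.dumps({
            "user_id": self.user_id,
            "url": self.url,
            "target_image": base64.b64encode(self.target_image).decode("ascii"),
        })

    @classmethod
    def from_json(cls, payload: str) -> "ForensicsRequest":
        # Inverse of to_json: decode the image back into raw bytes.
        data = json.loads(payload)
        return cls(data["user_id"], data["url"],
                   base64.b64decode(data["target_image"]))
```

On the server side, `ForensicsRequest.from_json(...)` would give back the URL used in step S102 and the target image used in step S105.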
  • Step S102 Extract the uniform resource locator from the webpage forensics request, and obtain the target webpage according to the uniform resource locator.
  • After the forensic server receives the web page forensics request, it can extract the URL of the target webpage from it, open a browser locally, and enter the URL in the browser's address bar, thereby obtaining the target webpage from the web server that stores it and displaying its content in the browser.
  • Step S103 Select a dynamic element object from the target webpage.
  • Because the forensic server is to take evidence screenshots of the dynamic content in the web page, it must first identify the area of the web page where the dynamic content is located (that is, the screenshot area). Since the dynamic content area changes constantly while other areas remain basically unchanged, the forensic server can identify the screenshot area in the target webpage based on this feature.
  • In the HTML system, each component that makes up a page (input box, text, picture, FLASH) is an element object.
  • In this embodiment, the target webpage can be read and each element object in it can be determined.
  • This step can be implemented in different ways depending on the actual application environment. For example, when a testing tool is used to analyze the coding of a webpage, the testing tool can load the target webpage and determine the target element to be tested in it. Alternatively, the browser can be called to load the target webpage, a script can be injected into the target webpage, and the coding of the target webpage can be analyzed through the injected script.
  • While the target webpage is loading, its elements are usually represented as a tree-like data structure.
  • Each element in the webpage corresponds uniquely to a node in the tree structure, and a node can have attribute information such as a Name attribute, an ID attribute, and a TagName attribute.
  • The attribute information can include unique identification information, such as the ID attribute; in a well-formed web page file, if an element object has a unique Name attribute, the Name attribute can also serve as the identification information. That is, the identification information uniquely identifies the corresponding node and also uniquely identifies the corresponding element object.
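As a rough illustration of enumerating element objects by their identification information, the sketch below walks static HTML with Python's standard `html.parser` module and records each element that carries an ID or Name attribute. This is an assumption-laden simplification: the text describes a browser or testing tool working on the loaded page, and the names `ElementCollector` and `list_element_objects` are hypothetical.

```python
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Collects (tag name, identifier) pairs for identifiable elements."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Prefer the ID attribute; fall back to the Name attribute.
        identifier = attr_map.get("id") or attr_map.get("name")
        if identifier is not None:
            self.elements.append((tag, identifier))


def list_element_objects(html_text):
    """Return the identifiable element objects found in html_text."""
    collector = ElementCollector()
    collector.feed(html_text)
    return collector.elements
```

Each returned pair could then serve as the handle for capturing frames of that element in steps S201 or S301.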
  • step S103 may include the specific process shown in FIG. 2:
  • Step S201 Collect N frames of images of the m-th element object in the target webpage.
  • Here 1 ≤ m ≤ M, where M is the total number of element objects in the target webpage. The pixels of a static element object basically do not change, while the pixels of a dynamic element object change constantly. Therefore, one frame image of the m-th element object in the target webpage can be collected at regular intervals (for example, every 0.2 seconds, 0.5 seconds, 1 second or 2 seconds), for a total of N collections, yielding N frames of images, where N is an integer greater than 1. Whether the m-th element object is a dynamic element object is then judged by evaluating how the frames change.
  • Step S202 Obtain the pixel value of each pixel of the m-th element object in each frame of image.
  • Step S203 Calculate the cumulative change amount of the pixel value of the m-th element object.
  • In the formula for the cumulative change amount, $n$ is the serial number of each frame image of the m-th element object, $1 \le n \le N$; $p$ is the serial number of each pixel of the m-th element object, $1 \le p \le PixNum$; and $PixNum$ is the total number of pixels of the m-th element object.
  • $(Red_{n,p}, Blue_{n,p}, Green_{n,p})$ is the pixel value of the p-th pixel of the m-th element object in the n-th frame image, where $Red_{n,p}$, $Blue_{n,p}$ and $Green_{n,p}$ are respectively the red, blue and green components of that pixel value.
  • $ChgVal$ is the cumulative change amount of the pixel value of the m-th element object.
  • Step S204 Determine the attribute of the m-th element object.
  • If the cumulative change amount of the pixel value of the m-th element object is greater than a preset first threshold, the m-th element object can be selected as the dynamic element object.
  • Conversely, if the cumulative change amount of the pixel value of the m-th element object is less than or equal to the first threshold, it can be regarded as a static element object.
  • the specific value of the first threshold may be set according to actual conditions, for example, it may be set to 10, 20, 50 or other values.
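The decision in steps S201 to S204 can be sketched as follows. The formula for ChgVal is not reproduced in the text, so this sketch assumes it is the sum, over all adjacent frame pairs and all pixels, of the squared differences of the three color components; that assumption mirrors the per-pixel change used in step S303 and may differ from the formula in the original filing. The function names and the frame representation (a list of `(red, green, blue)` tuples per frame) are also illustrative.

```python
def cumulative_change(frames):
    """Sum of squared color differences over adjacent frames (assumed form)."""
    total = 0
    for prev, curr in zip(frames, frames[1:]):
        for (r0, g0, b0), (r1, g1, b1) in zip(prev, curr):
            total += (r1 - r0) ** 2 + (g1 - g0) ** 2 + (b1 - b0) ** 2
    return total


def is_dynamic(frames, first_threshold=20):
    """An element is dynamic if its cumulative change exceeds the threshold."""
    return cumulative_change(frames) > first_threshold
```

The default of 20 is one of the example threshold values the text mentions; any value suited to the capture conditions could be substituted.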
  • step S103 may include the specific process shown in FIG. 3:
  • Step S301 Collect N frames of images of the m-th element object in the target webpage.
  • Step S302 Obtain the pixel value of each pixel of the m-th element object in each frame of image.
  • step S301 is the same as step S201
  • step S302 is the same as step S202.
  • Step S303 Calculate the pixel value change amount of each pixel of the m-th element object between adjacent frame images.
  • the pixel value change amount of each pixel of the m-th element object between adjacent frame images can be calculated according to the following formula:
  • $ChgPixVal_{n,p} = (Red_{n+1,p} - Red_{n,p})^2 + (Blue_{n+1,p} - Blue_{n,p})^2 + (Green_{n+1,p} - Green_{n,p})^2$
  • where $ChgPixVal_{n,p}$ is the pixel value change amount of the p-th pixel of the m-th element object between the n-th frame image and the (n+1)-th frame image.
  • Step S304 Count the number of dynamic pixels between adjacent frame images.
  • A dynamic pixel between the n-th frame image and the (n+1)-th frame image is a pixel whose pixel value change amount is greater than a preset change threshold.
  • the specific value of the change threshold can be set according to actual conditions, for example, it can be set to 0, 1, 2 or other values.
  • Step S305 Calculate the cumulative number of dynamic pixels of the m-th element object.
  • the cumulative number of dynamic pixels of the m-th element object can be calculated according to the following formula:
  • $ChgPixNum_n$ is the number of dynamic pixels between the n-th frame image and the (n+1)-th frame image, and $ChgPixTN$ is the cumulative number of dynamic pixels of the m-th element object.
  • Step S306 Determine the attribute of the m-th element object.
  • If the cumulative number of dynamic pixels of the m-th element object is greater than a preset second threshold, the m-th element object can be selected as the dynamic element object.
  • Conversely, if the cumulative number of dynamic pixels of the m-th element object is less than or equal to the second threshold, it can be regarded as a static element object.
  • The specific value of the second threshold can be set according to actual conditions.
  • For example, the value of the second threshold can be set according to the following formula:
  • $Thresh$ is the second threshold, and the preset proportional coefficient in the formula takes a value between 0 and 1; it can be set to 0.0001, 0.001, 0.01 or other values according to actual conditions.
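Steps S303 to S306 can be sketched in the same style. Two points are assumptions because the corresponding formulas are missing from the text: the cumulative number ChgPixTN is taken to be the sum of ChgPixNum over all adjacent frame pairs, and the second threshold is taken to be the proportional coefficient multiplied by the number of pixels and by the number of adjacent frame pairs. The function names are hypothetical.

```python
def count_dynamic_pixels(prev, curr, change_threshold=0):
    """Count pixels whose squared color change exceeds change_threshold."""
    count = 0
    for (r0, g0, b0), (r1, g1, b1) in zip(prev, curr):
        change = (r1 - r0) ** 2 + (g1 - g0) ** 2 + (b1 - b0) ** 2
        if change > change_threshold:
            count += 1
    return count


def cumulative_dynamic_pixels(frames, change_threshold=0):
    """Assumed form of ChgPixTN: total over all adjacent frame pairs."""
    return sum(count_dynamic_pixels(a, b, change_threshold)
               for a, b in zip(frames, frames[1:]))


def is_dynamic_by_count(frames, ratio=0.01, change_threshold=0):
    # ASSUMPTION: threshold = ratio * pixels per frame * adjacent frame pairs.
    second_threshold = ratio * len(frames[0]) * (len(frames) - 1)
    return cumulative_dynamic_pixels(frames, change_threshold) > second_threshold
```

Counting changed pixels rather than summing change magnitudes makes the decision less sensitive to a few pixels with very large changes.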
  • Step S104 Collect the image sequence of the dynamic element object.
  • After the dynamic element object is selected, one image of it can be collected at regular intervals (for example, every 0.2 seconds, 0.5 seconds, 1 second or 2 seconds), yielding the image sequence of the dynamic element object.
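A fixed-interval collection loop of this kind might look like the sketch below. The `capture` callable stands in for whatever screenshot mechanism the forensic server uses, which the text does not specify; the function name and the injectable `sleep` parameter are assumptions made so the loop can be exercised without waiting.

```python
import time


def collect_image_sequence(capture, frame_count, interval_seconds,
                           sleep=time.sleep):
    """Call capture() frame_count times, pausing between captures."""
    frames = []
    for index in range(frame_count):
        frames.append(capture())
        # No pause is needed after the final capture.
        if index < frame_count - 1:
            sleep(interval_seconds)
    return frames
```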
  • Step S105 Extract the target image from the web page forensics request, and calculate the similarity between each frame image of the image sequence and the target image.
  • To compare images, the prior art generally extracts feature vectors from the images with the LBP algorithm, the SIFT algorithm or similar algorithms, and takes the similarity between the feature vectors as the similarity between the images. Because feature vector extraction involves a large amount of computation, it consumes considerable resources and time. Since more similar images have more similar pixel value distributions, this embodiment preferably calculates similarity from statistics of the pixel value distribution.
  • the distribution ratio of the pixel points of each color component in the target image can be calculated according to the following formula:
  • $PN1$ is the total number of pixels in the target image; $StRPixNum_{pv}$, $StBPixNum_{pv}$ and $StGPixNum_{pv}$ are the total numbers of pixels in the target image whose red, blue and green components, respectively, have the value pv.
  • $StRRatio_{pv}$, $StBRatio_{pv}$ and $StGRatio_{pv}$ are the distribution ratios of pixels in the target image whose red, blue and green components, respectively, have the value pv.
  • $PVMax$ is the maximum pixel value; generally, the value of PVMax is 255.
  • the distribution ratio of the pixel points of each color component in the nth frame of the image sequence can be calculated according to the following formula:
  • $PN2$ is the total number of pixels in the n-th frame image of the image sequence; $CdRPixNum_{pv}$, $CdBPixNum_{pv}$ and $CdGPixNum_{pv}$ are the total numbers of pixels in the n-th frame image whose red, blue and green components, respectively, have the value pv.
  • $CdRRatio_{pv}$, $CdBRatio_{pv}$ and $CdGRatio_{pv}$ are the distribution ratios of pixels in the n-th frame image whose red, blue and green components, respectively, have the value pv.
  • Step S106 Select evidence images from each frame of the image sequence.
  • the evidence image is a frame of image whose similarity with the target image is greater than a preset similarity threshold.
  • the specific value of the similarity threshold may be set according to actual conditions, for example, it may be set to 0.9, 0.95, 0.98 or other values.
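Steps S105 and S106 can be sketched as a histogram comparison followed by a threshold filter. The distribution ratio is assumed to be the pixel count for a component value divided by the total pixel count, which follows from the definitions above. The similarity measure itself is not given in the text, so this sketch swaps in histogram intersection averaged over the three color channels; that choice, along with the function names, is an assumption rather than the formula of the original filing.

```python
def channel_ratios(pixels, pv_max=255):
    """Per-channel distribution ratios: count of each value / total pixels."""
    counts = [[0] * (pv_max + 1) for _ in range(3)]
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            counts[channel][value] += 1
    total = len(pixels)
    return [[c / total for c in channel_counts] for channel_counts in counts]


def histogram_similarity(target_pixels, frame_pixels, pv_max=255):
    # ASSUMPTION: histogram intersection, averaged over the three channels.
    target = channel_ratios(target_pixels, pv_max)
    frame = channel_ratios(frame_pixels, pv_max)
    per_channel = [sum(min(a, b) for a, b in zip(tc, fc))
                   for tc, fc in zip(target, frame)]
    return sum(per_channel) / 3


def select_evidence_images(target_pixels, sequence, similarity_threshold=0.9):
    """Keep the frames whose similarity to the target exceeds the threshold."""
    return [frame for frame in sequence
            if histogram_similarity(target_pixels, frame) > similarity_threshold]
```

With this measure, identical distributions score 1 and distributions with no shared component values score 0, so the example thresholds of 0.9, 0.95 or 0.98 apply directly.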
  • A timestamp is the total number of seconds elapsed since 00:00:00 GMT on January 1, 1970 (08:00:00 Beijing time on January 1, 1970). It is complete, verifiable data showing that a piece of data already existed before a certain time, and is usually a character sequence that uniquely identifies a moment in time.
  • the forensic server performs a hash operation on the evidence image to obtain a hash value corresponding to the evidence image.
  • the hash operation transforms an input of any length into a fixed-length output, and the output is the hash value.
  • This conversion is a compression mapping, that is, the length of the output is usually much smaller than the length of the input, different inputs may be hashed into the same output, and it is impossible to uniquely determine the input value from the output value. Simply put, it is a process of compressing messages of any length to a fixed-length message digest.
  • the hash operation used in this embodiment may include, but is not limited to, specific algorithms such as MD4, MD5, and SHA1.
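The hash operation on the evidence image can be sketched with Python's standard `hashlib` module. SHA1 is used as the default here only because it is one of the algorithms the text names; the function name is hypothetical, and the image is assumed to be available as raw bytes.

```python
import hashlib


def hash_evidence_image(image_bytes, algorithm="sha1"):
    """Return the fixed-length hexadecimal digest of the image bytes."""
    digest = hashlib.new(algorithm)
    digest.update(image_bytes)
    return digest.hexdigest()
```

Whatever the size of the input image, a SHA1 digest is always 40 hexadecimal characters, which is the fixed-length output sent to the time service system.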
  • The forensic server sends the hash value to the time service system.
  • The time service system should be a legally valid time service system recognized by the court.
  • One example is the Joint Trust Time Stamp Service Center, China's third-party trusted timestamp authentication service built by the National Time Service Center of the Chinese Academy of Sciences together with Beijing United Trust Technology Service Co., Ltd.
  • The National Time Service Center is responsible for time service and timekeeping monitoring; this monitoring guarantees that the time in the time stamp certificate is accurate and has not been tampered with.
  • The forensic server receives the time stamp certificate of the evidence image fed back by the time service system, and adds the time stamp certificate to the evidence image to obtain a stamped evidence image.
  • The time stamp certificate of the evidence image is the data obtained after the time service system digitally signs the hash value together with the system time. After the time service system receives the hash value of the evidence, it adds the timestamp at which the hash value was received, digitally signs the whole to obtain the time stamp certificate of the evidence image, and finally sends the time stamp certificate to the forensic server.
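The shape of this exchange can be sketched as below. Note the substitution: a real time service system would apply a public-key digital signature, whereas this sketch uses an HMAC with a shared key from the standard library purely as a stand-in, so it illustrates the data flow rather than the actual signing scheme. The function names and the certificate layout are assumptions.

```python
import hashlib
import hmac
import json


def issue_certificate(hash_value, received_at, signing_key):
    """Bind the hash value to the time it was received, then sign both."""
    payload = json.dumps({"hash": hash_value, "timestamp": received_at},
                         sort_keys=True)
    # STAND-IN: HMAC replaces the digital signature of a real service.
    signature = hmac.new(signing_key, payload.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_certificate(certificate, signing_key):
    """Recompute the signature and compare it in constant time."""
    expected = hmac.new(signing_key, certificate["payload"].encode("utf-8"),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, certificate["signature"])
```

Because the signature covers both the hash value and the timestamp, changing either one after issuance causes verification to fail.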
  • Further, the forensic server can also upload the stamped evidence image to a designated blockchain system, which should be a legally valid blockchain system recognized by the court.
  • The blockchain system can be a public chain, a consortium chain, or a private chain.
  • The blockchain system usually includes multiple nodes, and the forensic server in this embodiment is one of the write nodes.
  • The forensic server uploads the stamped evidence image to the blockchain system, and the nodes in the blockchain system obtain write permission for the evidence through a set consensus mechanism, which includes but is not limited to POW, POS, DPOS, PBFT, a sequential rotation mechanism or a random selection mechanism.
  • The node that obtains write permission sends the evidence to every node in the blockchain system in the form of a block so that each node verifies the block; if verification passes, the block is stored on the blockchain, and if verification fails, the block is deleted.
  • In this way, the nodes in the blockchain system jointly record the evidence information, which cannot be tampered with, and jointly endorse it; its credibility and transparency are higher than those of a single endorsement by the government.
  • During litigation, the stamped evidence image is obtained from the blockchain system through a terminal device designated by the court and shown in court.
  • In summary, this embodiment of the present application presets a forensic server for web page forensics.
  • When a user finds content that can be used as evidence in the dynamic content of a web page, the user can capture an image of that evidence from the dynamic content, namely the target image, and then send a web page forensics request to the forensic server through the user's own terminal device.
  • The web page forensics request includes the uniform resource locator of the web page and the target image. After receiving the request, the forensic server can first obtain the web page according to the uniform resource locator, select the dynamic element object from the web page, and collect the image sequence of the dynamic element object. It then calculates the similarity between each frame image of the image sequence and the target image and selects, on that basis, the evidence image that can be used as evidence. Because the evidence image is collected by the forensic server rather than by the user, the credibility of the evidence is greatly improved, so that it can be accepted by the court during litigation.
  • FIG. 4 shows a structural diagram of an embodiment of a web page forensics device provided in an embodiment of the present application.
  • a webpage forensics device may include:
  • the forensic request receiving module 401 is configured to receive a web forensic request sent by a terminal device, where the web forensic request includes the uniform resource locator and the target image of the target webpage;
  • the target webpage obtaining module 402 is configured to extract the uniform resource locator from the webpage forensics request, and obtain the target webpage according to the uniform resource locator;
  • the dynamic element object selection module 403 is used to select dynamic element objects from the target webpage
  • the image sequence acquisition module 404 is used to acquire the image sequence of the dynamic element object
  • the similarity calculation module 405 is configured to extract the target image from the web forensics request, and calculate the similarity between each frame image of the image sequence and the target image;
  • the evidence image selection module 406 is configured to select an evidence image from each frame image of the image sequence, and the evidence image is a frame image whose similarity with the target image is greater than a preset similarity threshold.
  • the dynamic element object selection module may include:
  • An image acquisition unit for acquiring N frames of images of the m-th element object in the target webpage, 1 ≤ m ≤ M, where M is the total number of element objects in the target webpage;
  • the pixel value obtaining unit is used to obtain the pixel value of each pixel of the m-th element object in each frame image
  • a pixel value cumulative change calculation unit for calculating the cumulative change of the pixel value of the m-th element object
  • the first selecting unit is configured to select the m-th element object as the dynamic element object if the cumulative change amount of the pixel value of the m-th element object is greater than the preset first threshold.
  • the dynamic element object selection module may further include:
  • the pixel value change calculation unit is used to calculate the pixel value change of each pixel of the m-th element object between adjacent frame images;
  • the dynamic pixel point statistical unit is used to separately count the number of dynamic pixels between adjacent frame images, where a dynamic pixel between the n-th frame image and the (n+1)-th frame image is a pixel whose pixel value change amount is greater than a preset change threshold;
  • Cumulative number calculation unit for calculating the cumulative number of dynamic pixels of the m-th element object
  • the second selecting unit is configured to select the m-th element object as the dynamic element object if the cumulative number of dynamic pixel points of the m-th element object is greater than the preset second threshold.
  • the similarity calculation module may include:
  • the first distribution ratio calculation unit is used to calculate the distribution ratio of the pixel points of each color component in the target image
  • the second distribution ratio calculation unit is used to calculate the distribution ratio of the pixel points of each color component in each frame image of the image sequence
  • the similarity calculation unit is configured to calculate the similarity between the nth frame image of the image sequence and the target image.
  • Fig. 5 shows a schematic block diagram of a server provided by an embodiment of the present application. For ease of description, only the parts related to the embodiment of the present application are shown.
  • The server 5 may include: a processor 50, a memory 51, and computer-readable instructions 52 stored in the memory 51 and executable on the processor 50, such as computer-readable instructions for executing the above-mentioned web page forensics method based on image recognition.
  • When the processor 50 executes the computer-readable instructions 52, the steps in the foregoing embodiments of the web page forensics method based on image recognition are implemented, such as steps S101 to S106 shown in FIG. 1.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

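A web page forensics method and apparatus based on image recognition, a computer non-volatile readable storage medium and a server. The method includes: receiving a web page forensics request sent by a terminal device (S101), the request including the uniform resource locator of a target webpage and a target image; extracting the uniform resource locator from the request and obtaining the target webpage according to it (S102); selecting a dynamic element object from the target webpage (S103) and collecting an image sequence of the dynamic element object (S104); extracting the target image from the request and calculating the similarity between each frame image of the image sequence and the target image (S105); and selecting an evidence image from the frame images of the image sequence (S106), the evidence image being a frame image whose similarity with the target image is greater than a preset similarity threshold. Because the evidence image is collected by the forensic server, the credibility of the evidence is greatly improved, so that it can be accepted by the court during litigation.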

Description

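Web page forensics method, apparatus, storage medium and server based on image recognition
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on July 19, 2019, with application number 201910652650.7 and the invention title "Web page forensics method, apparatus, storage medium and server based on image recognition", the entire contents of which are incorporated herein by reference.
Technical Field
This application belongs to the field of computer technology, and in particular relates to a web page forensics method and apparatus based on image recognition, a computer non-volatile readable storage medium and a server.
Background
With the spread of Internet technology, more and more information content has moved from print to web pages on the Internet, and the massive volume of web page information contains a great deal of content that can serve as evidence in judicial proceedings. For example, current web pages often contain a large amount of dynamic content such as animated images, FLASH animations and videos; a merchant may use a FLASH animation on its online store page for promotion and marketing, and that animation may contain content that can be used as evidence in litigation. Such evidence is easy to collect by taking screenshots or photos, but web pages are easily modified or deleted. Once the original web page no longer exists, evidence that the injured party collected from the dynamic content of the web page by screenshot or photo has extremely low credibility and is difficult for a court to accept during litigation.
Technical Problem
In view of this, the embodiments of the present application provide a web page forensics method and apparatus based on image recognition, a computer non-volatile readable storage medium and a server, to solve the problem that evidence collected from dynamic web content through screenshots or photos has extremely low credibility and is difficult for a court to accept during litigation.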
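Technical Solution
The first aspect of the embodiments of the present application provides a web page forensics method based on image recognition, which may include:
receiving a web page forensics request sent by a terminal device, the web page forensics request including the uniform resource locator of a target webpage and a target image;
extracting the uniform resource locator from the web page forensics request, and obtaining the target webpage according to the uniform resource locator;
selecting a dynamic element object from the target webpage, and collecting an image sequence of the dynamic element object;
extracting the target image from the web page forensics request, and calculating the similarity between each frame image of the image sequence and the target image;
selecting an evidence image from the frame images of the image sequence, the evidence image being a frame image whose similarity with the target image is greater than a preset similarity threshold.
The second aspect of the embodiments of the present application provides a web page forensics apparatus, which may include modules that implement the steps of the above web page forensics method.
The third aspect of the embodiments of the present application provides a computer non-volatile readable storage medium that stores computer-readable instructions; when the computer-readable instructions are executed by a processor, the steps of the above web page forensics method are implemented.
The fourth aspect of the embodiments of the present application provides a server, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the steps of the above web page forensics method are implemented.
Beneficial Effects
Compared with the prior art, the embodiments of this application have the following beneficial effect: a server for web page forensics (the forensic server, that is, the implementation subject of this embodiment) is set up in advance. When a user finds content that can be used as evidence in the dynamic content of a web page, the user can capture an image of that evidence from the dynamic content, namely the target image, and then send a web page forensics request to the forensic server through the user's own terminal device. The request includes the uniform resource locator of the web page and the target image. After receiving the request, the forensic server can first obtain the web page according to the uniform resource locator, select the dynamic element object from the web page, and collect the image sequence of the dynamic element object. It then calculates the similarity between each frame image of the image sequence and the target image and selects, on that basis, the evidence image that can be used as evidence. Because the evidence image is collected by the forensic server rather than by the user, the credibility of the evidence is greatly improved, so that it can be accepted by the court during litigation.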
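Brief Description of the Drawings
FIG. 1 is a flowchart of an embodiment of a web page forensics method based on image recognition in an embodiment of this application;
FIG. 2 is a schematic flowchart of one specific implementation of selecting a dynamic element object from a target webpage;
FIG. 3 is a schematic flowchart of another specific implementation of selecting a dynamic element object from a target webpage;
FIG. 4 is a structural diagram of an embodiment of a web page forensics apparatus in an embodiment of this application;
FIG. 5 is a schematic block diagram of a server in an embodiment of this application.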
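Embodiments of the Invention
Referring to FIG. 1, an embodiment of a web page forensics method based on image recognition in an embodiment of this application may include:
Step S101: Receive a web page forensics request sent by a terminal device.
The web page forensics request includes the uniform resource locator of the target webpage and the target image.
In this embodiment, a server for web page forensics is set up in advance and is referred to below as the forensic server. The forensic server is the implementation subject of this embodiment and the core of the entire forensics system. It can be set up by a court, or by another unit or organization authorized by a court. The forensics system can provide users with platform interfaces such as applications (APPs), web pages, and social platform official accounts. After registering on any platform interface through a terminal device such as a mobile phone, tablet or computer, a user can use the web page forensics service provided by the forensics system.
Because this embodiment is mainly applied in litigation scenarios, the real identity of the person providing the evidence must be obtained to meet later litigation needs. Therefore, before using the forensics system, a user must first pass real-name authentication by presenting an identity document for verification, and must leave contact details such as a telephone number and e-mail address for later communication.
After finding content that can be used as evidence in the dynamic content of a web page, the user can take that web page as the target webpage and send a web page forensics request to the forensic server through the user's own terminal device. Specifically, the user can first find the page for submitting web page forensics requests in the platform interface provided by the forensics system and fill in the Uniform Resource Locator (URL) of the target webpage in the designated area of that page. A URL is a concise representation of the location and access method of a resource available on the Internet, and is the address of a standard resource on the Internet. Generally, when a user browses a web page with a browser, the URL of the current web page is displayed in the browser's address bar, and the user can copy the URL directly from the address bar. The user opens the target webpage locally and monitors the changes of its dynamic content; when the user finds evidence content that can be used in litigation, the user takes a screenshot of that content locally to obtain the target image. After filling in the relevant information, the user clicks the submit button to send the forensics request to the forensic server. The forensics request carries the user's identity information, the URL of the target webpage, and the target image.
Step S102: Extract the uniform resource locator from the web page forensics request, and obtain the target webpage according to the uniform resource locator.
After the forensic server receives the web page forensics request, it can extract the URL of the target webpage from it, open a browser locally, and enter the URL in the browser's address bar, thereby obtaining the target webpage from the web server that stores it and displaying its content in the browser.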
步骤S103、从所述目标网页中选取动态元素对象。
由于取证服务器是要对网页中的动态内容进行证据截图,那么取证服务器首先要在网页中识别出动态内容所在的区域(也即截图区域),考虑到动态内容区域是在不断的变化之中的,而其它区域基本是没有变化的,取证服务器可以根据这一特点在目标网页中识别出截图区域。
在HTML体系中,组成一个页面的各个组件(输入框、文本、图片、FLASH)都是其中的一个元素对象。在本实施例中,可以读取目标网页,并确定目标网页中的各个元素对象。
具体实现该步骤时,可以根据实际应用环境的不同采取不同的方式。例如,当以测试工具对网页编码进行分析时,可以使用测试工具加载目标网页,并确定目标网页中待测的目标元素。也可以调用浏览器加载目标网页,通过向目标网页中注入脚本, 并通过所注入的脚本来对目标网页的编码进行分析。
在目标网页加载的过程中,目标网页中的各个元素通常会表示为树状的数据结构,网页中的各个元素唯一与树状结构中的一个节点相对应,而树状结构中的节点可以具有一些属性信息,例如,Name属性、ID属性、TagName属性等。在这些属性信息中可以包括一个唯一的标识信息,如ID属性;在书写规范的网页文件中,元素对象如果对应唯一的Name属性,该Name属性也可以作为标识信息。也即上述标识信息能够唯一标识对应点节点,同时也唯一标识了对应的元素对象。
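As a rough illustration of enumerating element objects by identifier, the sketch below parses static HTML with Python's standard `html.parser` and records each element that carries an ID or Name attribute. The helper names (`ElementCollector`, `collect_elements`) and the sample page are hypothetical; a real forensics server would more likely inspect the live tree inside a browser, as the text describes.

```python
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Collects (tag, identifier) pairs for elements that carry an id or name."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Prefer the ID attribute; fall back to Name when no ID is present.
        identifier = attr_map.get("id") or attr_map.get("name")
        if identifier:
            self.elements.append((tag, identifier))


def collect_elements(html_text):
    """Return the identified element objects of a page, in document order."""
    collector = ElementCollector()
    collector.feed(html_text)
    return collector.elements


# Hypothetical sample page used only for illustration.
SAMPLE_PAGE = '<div id="banner"><img id="promo" src="a.gif"><input name="q"></div>'
print(collect_elements(SAMPLE_PAGE))
```

Each identifier can then be used to locate the element's on-screen rectangle for the frame captures in the following steps.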
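In one specific implementation of this embodiment, step S103 may include the process shown in FIG. 2:
Step S201: capture N frames of the m-th element object in the target web page.
Here 1≤m≤M, where M is the total number of element objects in the target web page.
The pixels of a static element object essentially do not change, whereas the pixels of a dynamic element object change continuously. Therefore, one frame of the m-th element object in the target web page can be captured at a fixed interval (for example, 0.2 s, 0.5 s, 1 s, or 2 s), N times in total, yielding N frames, where N is an integer greater than 1. Whether the m-th element object is a dynamic element object is then judged by evaluating how the frames change.
Step S202: obtain the pixel value of each pixel of the m-th element object in each frame.
Step S203: compute the cumulative pixel-value change of the m-th element object.
For example, the cumulative pixel-value change of the m-th element object is computed according to the following formula:
Figure PCTCN2019118149-appb-000001
where n is the index of a frame of the m-th element object, 1≤n≤N; p is the index of a pixel of the m-th element object, 1≤p≤PixNum; PixNum is the total number of pixels of the m-th element object; $(\mathrm{Red}_{n,p}, \mathrm{Blue}_{n,p}, \mathrm{Green}_{n,p})$ is the pixel value of the p-th pixel of the m-th element object in the n-th frame, with $\mathrm{Red}_{n,p}$, $\mathrm{Blue}_{n,p}$, and $\mathrm{Green}_{n,p}$ being its red, blue, and green components respectively; and ChgVal is the cumulative pixel-value change of the m-th element object.
Step S204: determine the attribute of the m-th element object.
If the cumulative pixel-value change of the m-th element object is greater than a preset first threshold, the m-th element object may be selected as the dynamic element object; conversely, if it is less than or equal to the first threshold, the m-th element object may be treated as a static element object. The first threshold can be set according to the actual situation, for example to 10, 20, 50, or another value.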
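Steps S201 to S204 might be sketched as follows. The formula for ChgVal is available here only as a figure reference, so the accumulation rule used below (the sum of squared per-channel differences between consecutive frames, over all pixels) is an assumption modeled on the per-pixel expression of step S303, not the patent's confirmed formula. Frames are represented as plain lists of `(r, g, b)` tuples, and the function names are hypothetical.

```python
def cumulative_change(frames):
    """Sum squared RGB differences between consecutive frames over all pixels.

    ASSUMPTION: this accumulation rule stands in for the ChgVal formula,
    which is not reproduced in the text.
    """
    total = 0
    for prev, curr in zip(frames, frames[1:]):
        for (r0, g0, b0), (r1, g1, b1) in zip(prev, curr):
            total += (r1 - r0) ** 2 + (g1 - g0) ** 2 + (b1 - b0) ** 2
    return total


def is_dynamic(frames, first_threshold=20):
    """Step S204: dynamic if the cumulative change exceeds the first threshold."""
    return cumulative_change(frames) > first_threshold


# Two tiny two-pixel "elements", three frames each.
static_frames = [[(10, 10, 10), (20, 20, 20)]] * 3
moving_frames = [
    [(0, 0, 0), (20, 20, 20)],
    [(10, 0, 0), (20, 20, 20)],
    [(10, 10, 0), (20, 20, 20)],
]
print(is_dynamic(static_frames), is_dynamic(moving_frames))
```

With these inputs the static element accumulates no change, while the moving element accumulates 200 and is classified as dynamic under the example threshold of 20.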
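In another specific implementation of this embodiment, step S103 may include the process shown in FIG. 3:
Step S301: capture N frames of the m-th element object in the target web page.
Step S302: obtain the pixel value of each pixel of the m-th element object in each frame.
Step S301 is the same as step S201 and step S302 is the same as step S202; see the detailed description above, which is not repeated here.
Step S303: compute the pixel-value change of each pixel of the m-th element object between adjacent frames.
For example, the pixel-value change of each pixel of the m-th element object between adjacent frames can be computed according to the following formula:
$$\mathrm{ChgPixVal}_{n,p} = (\mathrm{Red}_{n+1,p}-\mathrm{Red}_{n,p})^2 + (\mathrm{Blue}_{n+1,p}-\mathrm{Blue}_{n,p})^2 + (\mathrm{Green}_{n+1,p}-\mathrm{Green}_{n,p})^2$$
where $\mathrm{ChgPixVal}_{n,p}$ is the pixel-value change of the p-th pixel of the m-th element object between the n-th frame and the (n+1)-th frame.
Step S304: count the number of dynamic pixels between each pair of adjacent frames.
A dynamic pixel between the n-th frame and the (n+1)-th frame is a pixel whose pixel-value change is greater than a preset change threshold. The change threshold can be set according to the actual situation, for example to 0, 1, 2, or another value.
Step S305: compute the cumulative number of dynamic pixels of the m-th element object.
For example, the cumulative number of dynamic pixels of the m-th element object can be computed according to the following formula:
Figure PCTCN2019118149-appb-000002
where $\mathrm{ChgPixNum}_{n}$ is the number of dynamic pixels between the n-th frame and the (n+1)-th frame, and ChgPixTN is the cumulative number of dynamic pixels of the m-th element object.
Step S306: determine the attribute of the m-th element object.
If the cumulative number of dynamic pixels of the m-th element object is greater than a preset second threshold, the m-th element object may be selected as the dynamic element object; conversely, if it is less than or equal to the second threshold, the m-th element object may be treated as a static element object. The second threshold can be set according to the actual situation; preferably, it is set according to the following formula:
$$\mathrm{Thresh} = \omega \times N \times \mathrm{PixNum}$$
where Thresh is the second threshold and ω is a preset proportionality coefficient, 0<ω<1, which can be set according to the actual situation to 0.0001, 0.001, 0.01, or another value.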
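The second implementation (steps S301 to S306) can be sketched in the same style. The per-pixel change and the threshold ω×N×PixNum follow the text; summing the per-pair counts over all adjacent frame pairs to obtain ChgPixTN is an assumption, since that formula appears only as a figure reference. Function names are hypothetical.

```python
def pixel_change(pixel_a, pixel_b):
    """ChgPixVal: sum of squared per-channel differences between two pixels."""
    return sum((a - b) ** 2 for a, b in zip(pixel_a, pixel_b))


def dynamic_pixel_total(frames, change_threshold=0):
    """Count dynamic pixels per adjacent frame pair and accumulate the counts.

    ASSUMPTION: the cumulative count is the plain sum over adjacent pairs.
    """
    total = 0
    for prev, curr in zip(frames, frames[1:]):
        total += sum(
            1 for p0, p1 in zip(prev, curr)
            if pixel_change(p0, p1) > change_threshold
        )
    return total


def is_dynamic_by_count(frames, omega=0.01, change_threshold=0):
    """Step S306 with the preferred threshold Thresh = omega * N * PixNum."""
    frame_count = len(frames)      # N
    pixel_count = len(frames[0])   # PixNum
    second_threshold = omega * frame_count * pixel_count
    return dynamic_pixel_total(frames, change_threshold) > second_threshold


# Three frames of a four-pixel element; one pixel changes at each step.
frames = [
    [(0, 0, 0), (5, 5, 5), (9, 9, 9), (1, 1, 1)],
    [(3, 0, 0), (5, 5, 5), (9, 9, 9), (1, 1, 1)],
    [(3, 0, 0), (5, 5, 5), (9, 9, 9), (1, 4, 1)],
]
print(dynamic_pixel_total(frames), is_dynamic_by_count(frames))
```

Counting changed pixels rather than summing magnitudes makes the decision less sensitive to a single pixel with a very large change, which may be why the threshold scales with the element's size.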
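Step S104: capture an image sequence of the dynamic element object.
Once a dynamic element object has been selected from the target web page, an image of it can be captured at a fixed interval (for example, 0.2 s, 0.5 s, 1 s, or 2 s), yielding the image sequence of the dynamic element object.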
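A periodic capture loop of this kind might look like the sketch below. The `capture_frame` callable is a hypothetical stand-in for whatever screenshot mechanism the forensics server uses (the text does not name one), and the `sleep` parameter exists only so the loop can be exercised without real delays.

```python
import time


def capture_sequence(capture_frame, frame_count, interval_seconds=0.5,
                     sleep=time.sleep):
    """Call capture_frame() frame_count times, pausing between captures."""
    frames = []
    for index in range(frame_count):
        frames.append(capture_frame())
        # No need to wait after the final capture.
        if index < frame_count - 1:
            sleep(interval_seconds)
    return frames


# Illustration with a fake capture source that returns increasing integers.
counter = iter(range(100))
sequence = capture_sequence(lambda: next(counter), 3, sleep=lambda s: None)
print(sequence)
```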
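Step S105: extract the target image from the web page forensics request, and separately compute the similarity between each frame of the image sequence and the target image.
This embodiment may involve many image comparisons. The comparison methods commonly used in the prior art extract feature vectors from images with the LBP algorithm, the SIFT algorithm, or similar algorithms and take the similarity between feature vectors as the similarity between images; feature extraction involves heavy computation and consumes considerable resources and time. Because images that are more alike also have more similar pixel-value distributions, this embodiment preferably computes similarity from statistics of the pixel-value distribution.
First, compute the distribution ratio of pixels at each value of each color component in the target image, and separately compute the distribution ratio of pixels at each value of each color component in each frame of the image sequence.
For example, the distribution ratio of pixels at each value of each color component in the target image can be computed according to the following formula:
Figure PCTCN2019118149-appb-000003
where PN1 is the total number of pixels in the target image; $\mathrm{StRPixNum}_{pv}$, $\mathrm{StBPixNum}_{pv}$, and $\mathrm{StGPixNum}_{pv}$ are the numbers of pixels in the target image whose red, blue, and green components, respectively, equal pv; $\mathrm{StRRatio}_{pv}$, $\mathrm{StBRatio}_{pv}$, and $\mathrm{StGRatio}_{pv}$ are the distribution ratios of pixels in the target image whose red, blue, and green components, respectively, equal pv; 0≤pv≤PVMax, where PVMax is the maximum pixel value, generally 255.
Similarly, the distribution ratio of pixels at each value of each color component in the n-th frame of the image sequence can be computed according to the following formula:
Figure PCTCN2019118149-appb-000004
where PN2 is the total number of pixels in the n-th frame of the image sequence; $\mathrm{CdRPixNum}_{pv}$, $\mathrm{CdBPixNum}_{pv}$, and $\mathrm{CdGPixNum}_{pv}$ are the numbers of pixels in the n-th frame whose red, blue, and green components, respectively, equal pv; and $\mathrm{CdRRatio}_{pv}$, $\mathrm{CdBRatio}_{pv}$, and $\mathrm{CdGRatio}_{pv}$ are the distribution ratios of pixels in the n-th frame whose red, blue, and green components, respectively, equal pv.
Then, the similarity between the n-th frame of the image sequence and the target image is computed according to the following formula:
Figure PCTCN2019118149-appb-000005
where $\mathrm{DiffRatio}_{pv} = (\mathrm{StRRatio}_{pv}-\mathrm{CdRRatio}_{pv})^2 + (\mathrm{StBRatio}_{pv}-\mathrm{CdBRatio}_{pv})^2 + (\mathrm{StGRatio}_{pv}-\mathrm{CdGRatio}_{pv})^2$, and SimDeg is the similarity between the n-th frame of the image sequence and the target image. As the formula shows, the more similar the pixel-value distributions of two images, the higher their similarity.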
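The histogram comparison of step S105 can be illustrated as below. Two parts are assumptions. First, each distribution ratio is taken to be the pixel count at a value divided by the image's total pixel count, which is what the definitions suggest. Second, the formula mapping DiffRatio to SimDeg is not reproduced in the text, so `similarity` uses a hypothetical normalization, 1 minus the summed DiffRatio divided by 6 (6 being the largest possible sum for three channels), chosen only so that identical distributions score 1 and fully disjoint ones score 0.

```python
def channel_ratios(pixels, pv_max=255):
    """Per-channel distribution ratios: ratios[channel][pv] = count / total."""
    counts = [[0] * (pv_max + 1) for _ in range(3)]
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            counts[channel][value] += 1
    total = len(pixels)
    return [[count / total for count in channel] for channel in counts]


def diff_ratio_sum(target_pixels, frame_pixels, pv_max=255):
    """Sum of DiffRatio over all pixel values pv in 0..pv_max."""
    st = channel_ratios(target_pixels, pv_max)
    cd = channel_ratios(frame_pixels, pv_max)
    return sum(
        (st[channel][pv] - cd[channel][pv]) ** 2
        for channel in range(3)
        for pv in range(pv_max + 1)
    )


def similarity(target_pixels, frame_pixels, pv_max=255):
    """ASSUMPTION: hypothetical SimDeg = 1 - sum(DiffRatio) / 6."""
    return 1.0 - diff_ratio_sum(target_pixels, frame_pixels, pv_max) / 6.0


target = [(1, 2, 3), (4, 5, 6)]
reordered = [(4, 5, 6), (1, 2, 3)]
print(similarity(target, reordered))
```

Because only value distributions are compared, two images containing the same pixels in a different arrangement are indistinguishable under this measure; that is the price of avoiding feature extraction.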
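Step S106: select an evidence image from the frames of the image sequence.
The evidence image is a frame whose similarity to the target image is greater than a preset similarity threshold. The similarity threshold can be set according to the actual situation, for example to 0.9, 0.95, 0.98, or another value.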
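Given one similarity score per frame, the selection in step S106 reduces to a threshold test. The text only requires a frame whose similarity exceeds the threshold; returning the most similar qualifying frame, as this sketch does, is one possible tie-breaking choice and an assumption, as is the function name.

```python
def select_evidence(similarities, threshold=0.95):
    """Return the index of the most similar frame above the threshold, or None."""
    best_index = None
    for index, value in enumerate(similarities):
        # Strictly greater than the threshold, as the text specifies.
        if value > threshold and (
            best_index is None or value > similarities[best_index]
        ):
            best_index = index
    return best_index


print(select_evidence([0.20, 0.97, 0.99, 0.50]))
```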
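Further, after collecting the evidence image, the forensics server can have a time service system add a timestamp to it, showing that the evidence existed at that point in time. A timestamp is the total number of seconds elapsed since 00:00:00 on January 1, 1970, Greenwich Mean Time (08:00:00 on January 1, 1970, Beijing time). It is complete, verifiable data that can show that a piece of data already existed before a particular time, and is usually a character sequence that uniquely identifies a moment in time.
First, the forensics server performs a hash operation on the evidence image to obtain the hash value corresponding to the evidence image.
A hash operation transforms an input of arbitrary length into an output of fixed length; that output is the hash value. The transformation is a compressing map: the output is usually much shorter than the input, different inputs may hash to the same output, and the input cannot be uniquely determined from the output. Put simply, it compresses a message of arbitrary length into a message digest of fixed length. The hash operations used in this embodiment may include, but are not limited to, specific algorithms such as MD4, MD5, and SHA1.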
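The fixed-length property is easy to see with Python's standard `hashlib`. The sketch uses SHA1 and MD5, two of the algorithms named in the text; the function name and the sample byte strings are illustrative only.

```python
import hashlib


def evidence_digest(image_bytes, algorithm="sha1"):
    """Return the hexadecimal digest of the raw image bytes."""
    return hashlib.new(algorithm, image_bytes).hexdigest()


short_input = b"abc"
long_input = b"x" * 1_000_000

# Inputs of very different lengths map to digests of the same length.
print(len(evidence_digest(short_input)), len(evidence_digest(long_input)))
print(evidence_digest(short_input, "md5"))
```

A SHA1 digest is always 40 hexadecimal characters and an MD5 digest 32, regardless of the size of the image being hashed.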
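Then, the forensics server sends the hash value to the time service system.
The time service system should be one that has been certified by a court and has legal effect. This embodiment preferably uses the Unitrust Timestamp Service Center (联合信任时间戳服务中心) to provide the timestamp service. It is a trusted third-party timestamp certification service in China, built by the National Time Service Center of the Chinese Academy of Sciences together with Beijing Unitrust Technology Service Co., Ltd. (北京联合信任技术服务有限公司). The National Time Service Center is responsible for time dissemination and timekeeping monitoring, and this timekeeping monitoring guarantees that the time in a timestamp certificate is accurate and has not been tampered with.
Finally, the forensics server receives the timestamp certificate of the evidence image returned by the time service system and adds the certificate to the evidence image, obtaining the timestamped evidence image.
The timestamp certificate of the evidence image is the data obtained when the time service system digitally signs the hash value together with the system time. After receiving the hash value of the evidence, the time service system attaches the timestamp of the moment the hash value was received, digitally signs the whole, and sends the resulting timestamp certificate to the forensics server.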
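The shape of a timestamp certificate (hash value plus time, signed as a whole) can be sketched with the standard library. Python's standard library has no public-key signature primitive, so this sketch substitutes an HMAC with a secret key for the authority's digital signature; that substitution, the JSON payload layout, and all names are assumptions for illustration and do not describe the actual certificate format of any timestamp service.

```python
import hashlib
import hmac
import json


def _payload(digest_hex, unix_time):
    """Canonical byte encoding of the data being signed."""
    body = {"digest": digest_hex, "time": unix_time}
    return json.dumps(body, sort_keys=True).encode("utf-8")


def issue_certificate(digest_hex, unix_time, authority_key):
    """Authority side: bind the hash value to the receipt time and sign both.

    HMAC is a stand-in for a real digital signature.
    """
    signature = hmac.new(authority_key, _payload(digest_hex, unix_time),
                         hashlib.sha256).hexdigest()
    return {"digest": digest_hex, "time": unix_time, "signature": signature}


def verify_certificate(certificate, authority_key):
    """Check that neither the digest nor the time was altered after signing."""
    expected = hmac.new(
        authority_key,
        _payload(certificate["digest"], certificate["time"]),
        hashlib.sha256,
    ).hexdigest()
    return hmac.compare_digest(expected, certificate["signature"])


certificate = issue_certificate("ab12", 1563494400, b"demo-key")
print(verify_certificate(certificate, b"demo-key"))
```

Because the time is inside the signed payload, changing either the digest or the time afterwards invalidates the certificate.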
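Further, to keep the evidence secure, the forensics server can also upload the timestamped evidence image to a designated blockchain system, which should be one that has been certified by a court and has legal effect. The blockchain system may be a public chain, a consortium chain, or a private chain. A blockchain system usually includes multiple nodes, and the forensics server in this embodiment is one of its writing nodes.
The forensics server uploads the timestamped evidence image to the blockchain system, and the nodes of the blockchain system obtain write permission for the evidence through a configured consensus mechanism, which includes but is not limited to specific mechanisms such as POW, POS, DPOS, PBFT, sequential rotation, or random selection. The node that obtains write permission sends the evidence, in the form of a block, to every node in the blockchain system so that each node verifies the block. If verification passes, the block is stored on the blockchain; if verification fails, the block is deleted.
If the block is not confirmed in the blockchain system, a failure result is returned to the forensics server. Conversely, if the block is confirmed and stored, a success result is returned, so that the state of information stored in the blockchain system is always definite and no data is lost. Because blockchain storage is distributed, the nodes of the blockchain system jointly record the evidence information, which cannot be tampered with, and jointly endorse it; its credibility and transparency are higher than those of endorsement by a government alone.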
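The verify-then-store behavior and the success or failure result can be shown with a toy hash-linked chain. This sketch deliberately omits networking and the consensus mechanism (POW, PBFT, and so on) and models a single node's check only; the block layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def block_hash(previous_hash, payload):
    """Hash that commits to both the payload and the preceding block."""
    body = json.dumps({"previous": previous_hash, "payload": payload},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def make_block(previous_hash, payload):
    return {"previous": previous_hash, "payload": payload,
            "hash": block_hash(previous_hash, payload)}


def submit_block(chain, block):
    """Store the block if it verifies; return True (success) or False (failure)."""
    expected_previous = chain[-1]["hash"] if chain else GENESIS_HASH
    valid = (
        block["previous"] == expected_previous
        and block["hash"] == block_hash(block["previous"], block["payload"])
    )
    if valid:
        chain.append(block)
    # An invalid block is simply discarded.
    return valid


chain = []
first = make_block(GENESIS_HASH, "stamped-evidence-1")
print(submit_block(chain, first))
```

Altering a stored payload would change its hash and break the link from every later block, which is the tamper-evidence property the text relies on.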
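During litigation, a user who needs to present the evidence to the court can apply to the court. After reviewing and approving the application, the court obtains the timestamped evidence image from the blockchain system through a terminal device it designates and displays it in court.
In summary, the embodiments of this application set up a forensics server for web page forensics in advance. When a user finds content usable as evidence in the dynamic content of a web page, the user can capture an image of that evidence, namely the target image, from the dynamic content and then send a web page forensics request to the forensics server from the user's own terminal device; the request includes the page's uniform resource locator and the target image. After receiving the request, the forensics server first obtains the web page according to the uniform resource locator, selects a dynamic element object from the page, captures an image sequence of the dynamic element object, computes the similarity between each frame of the image sequence and the target image, and on that basis selects an evidence image. Because the evidence image is collected by the forensics server rather than by the user, the credibility of the evidence is greatly improved, so that it can be accepted by a court during litigation.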
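Corresponding to the image-recognition-based web page forensics method described in the embodiments above, FIG. 4 shows a structural diagram of one embodiment of a web page forensics apparatus provided by the embodiments of this application.
In this embodiment, a web page forensics apparatus may include:
a forensics request receiving module 401, configured to receive a web page forensics request sent by a terminal device, the web page forensics request including the uniform resource locator of a target web page and a target image;
a target web page obtaining module 402, configured to extract the uniform resource locator from the web page forensics request and obtain the target web page according to the uniform resource locator;
a dynamic element object selection module 403, configured to select a dynamic element object from the target web page;
an image sequence capture module 404, configured to capture an image sequence of the dynamic element object;
a similarity computation module 405, configured to extract the target image from the web page forensics request and separately compute the similarity between each frame of the image sequence and the target image;
an evidence image selection module 406, configured to select an evidence image from the frames of the image sequence, the evidence image being a frame whose similarity to the target image is greater than a preset similarity threshold.
Optionally, the dynamic element object selection module may include:
an image capture unit, configured to capture N frames of the m-th element object in the target web page, where 1≤m≤M and M is the total number of element objects in the target web page;
a pixel value obtaining unit, configured to obtain the pixel value of each pixel of the m-th element object in each frame;
a cumulative pixel-value change computation unit, configured to compute the cumulative pixel-value change of the m-th element object;
a first selection unit, configured to select the m-th element object as the dynamic element object if the cumulative pixel-value change of the m-th element object is greater than a preset first threshold.
Optionally, the dynamic element object selection module may further include:
a pixel-value change computation unit, configured to compute the pixel-value change of each pixel of the m-th element object between adjacent frames;
a dynamic pixel counting unit, configured to count the number of dynamic pixels between each pair of adjacent frames, where a dynamic pixel between the n-th frame and the (n+1)-th frame is a pixel whose pixel-value change is greater than a preset change threshold;
a cumulative count computation unit, configured to compute the cumulative number of dynamic pixels of the m-th element object;
a second selection unit, configured to select the m-th element object as the dynamic element object if the cumulative number of dynamic pixels of the m-th element object is greater than a preset second threshold.
Further, the similarity computation module may include:
a first distribution ratio computation unit, configured to compute the distribution ratio of pixels at each value of each color component in the target image;
a second distribution ratio computation unit, configured to separately compute the distribution ratio of pixels at each value of each color component in each frame of the image sequence;
a similarity computation unit, configured to compute the similarity between the n-th frame of the image sequence and the target image.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the apparatus, modules, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed or recorded in one embodiment, see the related descriptions of the other embodiments.
FIG. 5 shows a schematic block diagram of a server provided by the embodiments of this application; for ease of description, only the parts relevant to the embodiments of this application are shown.
In this embodiment, the server 5 may include a processor 50, a memory 51, and computer-readable instructions 52 stored in the memory 51 and executable on the processor 50, for example computer-readable instructions for performing the image-recognition-based web page forensics method described above. When executing the computer-readable instructions 52, the processor 50 implements the steps of the above embodiments of the image-recognition-based web page forensics method, for example steps S101 to S106 shown in FIG. 1.
Those of ordinary skill in the art will understand that all or part of the processes of the above method embodiments can be carried out by computer-readable instructions directing the relevant hardware. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or another medium used in the embodiments provided by this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in those embodiments can still be modified, or some of their technical features replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.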

Claims (20)

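  1. An image-recognition-based web page forensics method, characterized by comprising:
    receiving a web page forensics request sent by a terminal device, the web page forensics request including the uniform resource locator of a target web page and a target image;
    extracting the uniform resource locator from the web page forensics request, and obtaining the target web page according to the uniform resource locator;
    selecting a dynamic element object from the target web page, and capturing an image sequence of the dynamic element object;
    extracting the target image from the web page forensics request, and separately computing the similarity between each frame of the image sequence and the target image;
    selecting an evidence image from the frames of the image sequence, the evidence image being a frame whose similarity to the target image is greater than a preset similarity threshold.
  2. The web page forensics method according to claim 1, characterized in that selecting a dynamic element object from the target web page comprises:
    capturing N frames of the m-th element object in the target web page, where 1≤m≤M and M is the total number of element objects in the target web page;
    obtaining the pixel value of each pixel of the m-th element object in each frame;
    computing the cumulative pixel-value change of the m-th element object according to the following formula:
    Figure PCTCN2019118149-appb-100001
    where n is the index of a frame of the m-th element object, 1≤n≤N; p is the index of a pixel of the m-th element object, 1≤p≤PixNum; PixNum is the total number of pixels of the m-th element object; $(\mathrm{Red}_{n,p}, \mathrm{Blue}_{n,p}, \mathrm{Green}_{n,p})$ is the pixel value of the p-th pixel of the m-th element object in the n-th frame, with $\mathrm{Red}_{n,p}$, $\mathrm{Blue}_{n,p}$, and $\mathrm{Green}_{n,p}$ being its red, blue, and green components respectively; and ChgVal is the cumulative pixel-value change of the m-th element object;
    if the cumulative pixel-value change of the m-th element object is greater than a preset first threshold, selecting the m-th element object as the dynamic element object.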
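  3. The web page forensics method according to claim 1, characterized in that selecting a dynamic element object from the target web page comprises:
    capturing N frames of the m-th element object in the target web page;
    obtaining the pixel value of each pixel of the m-th element object in each frame;
    computing the pixel-value change of each pixel of the m-th element object between adjacent frames according to the following formula:
    $$\mathrm{ChgPixVal}_{n,p} = (\mathrm{Red}_{n+1,p}-\mathrm{Red}_{n,p})^2 + (\mathrm{Blue}_{n+1,p}-\mathrm{Blue}_{n,p})^2 + (\mathrm{Green}_{n+1,p}-\mathrm{Green}_{n,p})^2$$
    where $\mathrm{ChgPixVal}_{n,p}$ is the pixel-value change of the p-th pixel of the m-th element object between the n-th frame and the (n+1)-th frame;
    counting the number of dynamic pixels between each pair of adjacent frames, where a dynamic pixel between the n-th frame and the (n+1)-th frame is a pixel whose pixel-value change is greater than a preset change threshold;
    computing the cumulative number of dynamic pixels of the m-th element object according to the following formula:
    Figure PCTCN2019118149-appb-100002
    where $\mathrm{ChgPixNum}_{n}$ is the number of dynamic pixels between the n-th frame and the (n+1)-th frame, and ChgPixTN is the cumulative number of dynamic pixels of the m-th element object;
    if the cumulative number of dynamic pixels of the m-th element object is greater than a preset second threshold, selecting the m-th element object as the dynamic element object.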
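  4. The web page forensics method according to any one of claims 1 to 3, characterized in that separately computing the similarity between each frame of the image sequence and the target image comprises:
    computing the distribution ratio of pixels at each value of each color component in the target image, and separately computing the distribution ratio of pixels at each value of each color component in each frame of the image sequence;
    computing the similarity between the n-th frame of the image sequence and the target image according to the following formula:
    Figure PCTCN2019118149-appb-100003
    where $\mathrm{DiffRatio}_{pv} = (\mathrm{StRRatio}_{pv}-\mathrm{CdRRatio}_{pv})^2 + (\mathrm{StBRatio}_{pv}-\mathrm{CdBRatio}_{pv})^2 + (\mathrm{StGRatio}_{pv}-\mathrm{CdGRatio}_{pv})^2$; $\mathrm{CdRRatio}_{pv}$ is the distribution ratio of pixels in the n-th frame whose red component equals pv; $\mathrm{StRRatio}_{pv}$ is the distribution ratio of pixels in the target image whose red component equals pv; $\mathrm{CdBRatio}_{pv}$ is the distribution ratio of pixels in the n-th frame whose blue component equals pv; $\mathrm{StBRatio}_{pv}$ is the distribution ratio of pixels in the target image whose blue component equals pv; $\mathrm{CdGRatio}_{pv}$ is the distribution ratio of pixels in the n-th frame whose green component equals pv; $\mathrm{StGRatio}_{pv}$ is the distribution ratio of pixels in the target image whose green component equals pv; 0≤pv≤PVMax, where PVMax is the maximum pixel value; and SimDeg is the similarity between the n-th frame of the image sequence and the target image.
  5. The web page forensics method according to claim 4, characterized in that computing the distribution ratio of pixels at each value of each color component in the target image comprises:
    computing the distribution ratio of pixels at each value of each color component in the target image according to the following formula:
    Figure PCTCN2019118149-appb-100004
    where PN1 is the total number of pixels in the target image, $\mathrm{StRPixNum}_{pv}$ is the number of pixels in the target image whose red component equals pv, $\mathrm{StBPixNum}_{pv}$ is the number of pixels in the target image whose blue component equals pv, and $\mathrm{StGPixNum}_{pv}$ is the number of pixels in the target image whose green component equals pv.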
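  6. A web page forensics apparatus, characterized by comprising:
    a forensics request receiving module, configured to receive a web page forensics request sent by a terminal device, the web page forensics request including the uniform resource locator of a target web page and a target image;
    a target web page obtaining module, configured to extract the uniform resource locator from the web page forensics request and obtain the target web page according to the uniform resource locator;
    a dynamic element object selection module, configured to select a dynamic element object from the target web page;
    an image sequence capture module, configured to capture an image sequence of the dynamic element object;
    a similarity computation module, configured to extract the target image from the web page forensics request and separately compute the similarity between each frame of the image sequence and the target image;
    an evidence image selection module, configured to select an evidence image from the frames of the image sequence, the evidence image being a frame whose similarity to the target image is greater than a preset similarity threshold.
  7. The web page forensics apparatus according to claim 6, characterized in that the dynamic element object selection module comprises:
    an image capture unit, configured to capture N frames of the m-th element object in the target web page, where 1≤m≤M and M is the total number of element objects in the target web page;
    a pixel value obtaining unit, configured to obtain the pixel value of each pixel of the m-th element object in each frame;
    a cumulative pixel-value change computation unit, configured to compute the cumulative pixel-value change of the m-th element object;
    a first selection unit, configured to select the m-th element object as the dynamic element object if the cumulative pixel-value change of the m-th element object is greater than a preset first threshold.
  8. The web page forensics apparatus according to claim 6, characterized in that the dynamic element object selection module further comprises:
    a pixel-value change computation unit, configured to compute the pixel-value change of each pixel of the m-th element object between adjacent frames;
    a dynamic pixel counting unit, configured to count the number of dynamic pixels between each pair of adjacent frames, where a dynamic pixel between the n-th frame and the (n+1)-th frame is a pixel whose pixel-value change is greater than a preset change threshold;
    a cumulative count computation unit, configured to compute the cumulative number of dynamic pixels of the m-th element object;
    a second selection unit, configured to select the m-th element object as the dynamic element object if the cumulative number of dynamic pixels of the m-th element object is greater than a preset second threshold.
  9. The web page forensics apparatus according to claim 6, characterized in that the similarity computation module comprises:
    a first distribution ratio computation unit, configured to compute the distribution ratio of pixels at each value of each color component in the target image;
    a second distribution ratio computation unit, configured to separately compute the distribution ratio of pixels at each value of each color component in each frame of the image sequence;
    a similarity computation unit, configured to compute the similarity between the n-th frame of the image sequence and the target image.
  10. The web page forensics apparatus according to claim 9, characterized in that the first distribution ratio computation unit is specifically configured to compute the distribution ratio of pixels at each value of each color component in the target image according to the following formula:
    Figure PCTCN2019118149-appb-100005
    where PN1 is the total number of pixels in the target image, $\mathrm{StRPixNum}_{pv}$ is the number of pixels in the target image whose red component equals pv, $\mathrm{StBPixNum}_{pv}$ is the number of pixels in the target image whose blue component equals pv, and $\mathrm{StGPixNum}_{pv}$ is the number of pixels in the target image whose green component equals pv.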
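  11. A non-volatile computer-readable storage medium storing computer-readable instructions, characterized in that the computer-readable instructions, when executed by a processor, implement the following steps:
    receiving a web page forensics request sent by a terminal device, the web page forensics request including the uniform resource locator of a target web page and a target image;
    extracting the uniform resource locator from the web page forensics request, and obtaining the target web page according to the uniform resource locator;
    selecting a dynamic element object from the target web page, and capturing an image sequence of the dynamic element object;
    extracting the target image from the web page forensics request, and separately computing the similarity between each frame of the image sequence and the target image;
    selecting an evidence image from the frames of the image sequence, the evidence image being a frame whose similarity to the target image is greater than a preset similarity threshold.
  12. The non-volatile computer-readable storage medium according to claim 11, characterized in that selecting a dynamic element object from the target web page comprises:
    capturing N frames of the m-th element object in the target web page, where 1≤m≤M and M is the total number of element objects in the target web page;
    obtaining the pixel value of each pixel of the m-th element object in each frame;
    computing the cumulative pixel-value change of the m-th element object;
    if the cumulative pixel-value change of the m-th element object is greater than a preset first threshold, selecting the m-th element object as the dynamic element object.
  13. The non-volatile computer-readable storage medium according to claim 11, characterized in that selecting a dynamic element object from the target web page comprises:
    capturing N frames of the m-th element object in the target web page;
    obtaining the pixel value of each pixel of the m-th element object in each frame;
    computing the pixel-value change of each pixel of the m-th element object between adjacent frames;
    counting the number of dynamic pixels between each pair of adjacent frames, where a dynamic pixel between the n-th frame and the (n+1)-th frame is a pixel whose pixel-value change is greater than a preset change threshold;
    computing the cumulative number of dynamic pixels of the m-th element object;
    if the cumulative number of dynamic pixels of the m-th element object is greater than a preset second threshold, selecting the m-th element object as the dynamic element object.
  14. The non-volatile computer-readable storage medium according to any one of claims 11 to 13, characterized in that separately computing the similarity between each frame of the image sequence and the target image comprises:
    computing the distribution ratio of pixels at each value of each color component in the target image, and separately computing the distribution ratio of pixels at each value of each color component in each frame of the image sequence;
    computing the similarity between the n-th frame of the image sequence and the target image.
  15. The non-volatile computer-readable storage medium according to claim 14, characterized in that computing the distribution ratio of pixels at each value of each color component in the target image comprises:
    computing the distribution ratio of pixels at each value of each color component in the target image according to the following formula:
    Figure PCTCN2019118149-appb-100006
    where PN1 is the total number of pixels in the target image, $\mathrm{StRPixNum}_{pv}$ is the number of pixels in the target image whose red component equals pv, $\mathrm{StBPixNum}_{pv}$ is the number of pixels in the target image whose blue component equals pv, and $\mathrm{StGPixNum}_{pv}$ is the number of pixels in the target image whose green component equals pv.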
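  16. A server comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the processor implements the following steps when executing the computer-readable instructions:
    receiving a web page forensics request sent by a terminal device, the web page forensics request including the uniform resource locator of a target web page and a target image;
    extracting the uniform resource locator from the web page forensics request, and obtaining the target web page according to the uniform resource locator;
    selecting a dynamic element object from the target web page, and capturing an image sequence of the dynamic element object;
    extracting the target image from the web page forensics request, and separately computing the similarity between each frame of the image sequence and the target image;
    selecting an evidence image from the frames of the image sequence, the evidence image being a frame whose similarity to the target image is greater than a preset similarity threshold.
  17. The server according to claim 16, characterized in that selecting a dynamic element object from the target web page comprises:
    capturing N frames of the m-th element object in the target web page, where 1≤m≤M and M is the total number of element objects in the target web page;
    obtaining the pixel value of each pixel of the m-th element object in each frame;
    computing the cumulative pixel-value change of the m-th element object;
    if the cumulative pixel-value change of the m-th element object is greater than a preset first threshold, selecting the m-th element object as the dynamic element object.
  18. The server according to claim 16, characterized in that selecting a dynamic element object from the target web page comprises:
    capturing N frames of the m-th element object in the target web page;
    obtaining the pixel value of each pixel of the m-th element object in each frame;
    computing the pixel-value change of each pixel of the m-th element object between adjacent frames;
    counting the number of dynamic pixels between each pair of adjacent frames, where a dynamic pixel between the n-th frame and the (n+1)-th frame is a pixel whose pixel-value change is greater than a preset change threshold;
    computing the cumulative number of dynamic pixels of the m-th element object;
    if the cumulative number of dynamic pixels of the m-th element object is greater than a preset second threshold, selecting the m-th element object as the dynamic element object.
  19. The server according to any one of claims 16 to 18, characterized in that separately computing the similarity between each frame of the image sequence and the target image comprises:
    computing the distribution ratio of pixels at each value of each color component in the target image, and separately computing the distribution ratio of pixels at each value of each color component in each frame of the image sequence;
    computing the similarity between the n-th frame of the image sequence and the target image.
  20. The server according to claim 19, characterized in that computing the distribution ratio of pixels at each value of each color component in the target image comprises:
    computing the distribution ratio of pixels at each value of each color component in the target image according to the following formula:
    Figure PCTCN2019118149-appb-100007
    where PN1 is the total number of pixels in the target image, $\mathrm{StRPixNum}_{pv}$ is the number of pixels in the target image whose red component equals pv, $\mathrm{StBPixNum}_{pv}$ is the number of pixels in the target image whose blue component equals pv, and $\mathrm{StGPixNum}_{pv}$ is the number of pixels in the target image whose green component equals pv.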
PCT/CN2019/118149 2019-07-19 2019-11-13 Image-recognition-based web page forensics method, apparatus, storage medium, and server WO2021012522A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910652650.7A CN110472128B (zh) 2019-07-19 2019-07-19 Image-recognition-based web page forensics method, apparatus, storage medium, and server
CN201910652650.7 2019-07-19

Publications (1)

Publication Number Publication Date
WO2021012522A1 true WO2021012522A1 (zh) 2021-01-28

Family

ID=68508759

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118149 WO2021012522A1 (zh) 2019-07-19 2019-11-13 Image-recognition-based web page forensics method, apparatus, storage medium, and server

Country Status (2)

Country Link
CN (1) CN110472128B (zh)
WO (1) WO2021012522A1 (zh)

Also Published As

Publication number Publication date
CN110472128A (zh) 2019-11-19
CN110472128B (zh) 2022-09-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19938185

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19938185

Country of ref document: EP

Kind code of ref document: A1