CN104200499A - Technical method for intelligently removing reduplication of information images - Google Patents

Info

Publication number
CN104200499A
Authority
CN
China
Prior art keywords
picture
pictures
images
pixel
code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410490922.5A
Other languages
Chinese (zh)
Inventor
Inventor name withheld from publication
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ying Weinuo Science And Technology Ltd Of Shenzhen
Original Assignee
Ying Weinuo Science And Technology Ltd Of Shenzhen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-09-24
Filing date
2014-09-24
Publication date
2014-12-10
Application filed by Ying Weinuo Science And Technology Ltd Of Shenzhen
Priority to CN201410490922.5A
Publication of CN104200499A

Abstract

The invention discloses an image deduplication method, and in particular a method for deduplicating similar images. Each image is converted to a SHA-1 (Secure Hash Algorithm 1) code, and the SHA-1 codes are stored in a hash structure according to a validity period. Identical images have the same SHA-1 code, so whether two images are identical can be determined at low cost. If an image has been modified, it is scaled down, its features are extracted, and the similarity of the two images is judged by the Hamming distance. The method addresses the repeated appearance of similar or identical images and improves the user's reading experience.

Description

A technical method for intelligent deduplication of information images
Technical field
The present invention relates to the technical field of image deduplication in information (news) software for smartphones, and in particular to a technical method for detecting and removing duplicate images in Internet applications.
Background
With the explosive growth of Internet information, the number of images circulating on the Internet keeps growing as well. When the same image has been watermarked, cropped, or otherwise processed, even the human eye has difficulty telling the versions apart after viewing many of them, let alone a machine. Processing a large volume of information images manually is time-consuming and costly, and the results are poor. It is therefore important to have a method by which a machine can distinguish and deduplicate images quickly and effectively, and in particular can quickly screen out images that are very similar to one another.
Image deduplication judges whether two images are identical or nearly identical, so that two similar images are not shown to the user at the same time. One current method of image deduplication is as follows: compute the SHA-1 code of each image and store the codes in a hash structure according to a validity period. Identical images have the same SHA-1 code. This method can determine at low cost whether two images are identical, but for a modified image it is necessary to scale the image, extract its features, and then use the Hamming distance to judge the similarity of the two images.
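The exact-match stage described above can be illustrated with a short sketch. It is not taken from the application: the class name ExpiringDigestStore, the method check_and_add, and the ttl_seconds parameter are hypothetical, and an in-memory dictionary stands in for whatever hash store is actually used. Only hashlib.sha1 is a real library call.

```python
import hashlib
import time


class ExpiringDigestStore:
    """Hypothetical in-memory stand-in for the hash store of SHA-1 codes."""

    def __init__(self, ttl_seconds):
        # ttl_seconds models the "validity period"; its value is an assumption.
        self.ttl_seconds = ttl_seconds
        self._expiry_by_digest = {}  # SHA-1 hex digest -> expiry timestamp

    def check_and_add(self, image_bytes, now=None):
        """Return True if a byte-identical image is recorded and not expired.

        Otherwise record the digest with a fresh expiry and return False.
        """
        now = time.time() if now is None else now
        digest = hashlib.sha1(image_bytes).hexdigest()
        expiry = self._expiry_by_digest.get(digest)
        if expiry is not None and expiry > now:
            return True
        self._expiry_by_digest[digest] = now + self.ttl_seconds
        return False
```

Because SHA-1 is computed over the raw bytes, changing a single pixel or re-encoding the file produces a different code, which is why the modified-image case needs the separate fingerprint comparison.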
Summary of the invention
The main purpose of the present invention is to provide an image deduplication method, and in particular a solution for deduplicating near-identical images, so as to solve the repeated appearance of similar or identical images and improve the user's reading experience.
To solve the above problem, the following solution is provided:
1. Convert the image to be compared to a SHA-1 code and compare it with the SHA-1 codes of other images stored in the hash structure; identical images are found by their SHA-1 codes. This is the most efficient and lowest-cost check.
2. If the SHA-1 code is not found, crop the image. Based on the images observed, most watermarks sit at the bottom of the image, so the image is cropped to cut the watermark away from the original.
3. Extract the features of the image: shrink the cropped image to a thumbnail of 8*8 size, that is, 64 pixels.
4. Compute the binary fingerprint of the image according to the positions of the pixels.
5. Compute the Hamming distance between the fingerprints of each pair of images. The closeness of two images can be compared by the Hamming distance.
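Steps 2 and 3 above can be sketched as follows. This is a minimal illustration under stated assumptions: the image is already decoded to grayscale and held as a list of rows, the function names crop_bottom and shrink_to_8x8 and the keep_fraction value are hypothetical, and block averaging is an assumed resampling method, since the application does not say how much to crop or how to shrink.

```python
def crop_bottom(pixels, keep_fraction=0.8):
    """Keep the top part of a grayscale image given as a list of rows.

    keep_fraction is an assumed value; the source only says that most
    watermarks sit at the bottom of the image.
    """
    keep_rows = max(1, int(len(pixels) * keep_fraction))
    return pixels[:keep_rows]


def shrink_to_8x8(pixels, size=8):
    """Shrink a grayscale image to size x size by averaging pixel blocks."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError(f"image must be at least {size}x{size}")
    thumb = []
    for ty in range(size):
        # Row range of the source block that maps to thumbnail row ty.
        y0, y1 = ty * height // size, (ty + 1) * height // size
        row = []
        for tx in range(size):
            x0, x1 = tx * width // size, (tx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        thumb.append(row)
    return thumb
```

Shrinking to 64 pixels discards fine detail such as small stamps and compression noise, so edited versions of one image tend to produce nearly the same thumbnail.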
Embodiment:
One, when an image is about to be published, first convert it to a SHA-1 code and compare it with the hash set stored in the redis database. If an identical code is found, the image has already been published. If not, save the SHA-1 code of the image in the hash set and proceed with the following steps.
Two, crop the image, keeping its central feature region. This effectively filters out watermarks at the edges of the image and improves the accuracy of the judgment.
Three, extract the features of the cropped image. The features of an image can be represented by a thumbnail, so shrink the image to a thumbnail of 8*8 size, 64 pixels in total.
Four, take the value of each pixel and compute the mean of the 64 pixels.
Five, compare each pixel with the mean of the image: if the value is greater than the mean, record 1; if it is less than the mean, record 0. This yields a 64-bit binary code, which is the fingerprint of the image. The 64 bits are ordered: each position in the 8*8 thumbnail corresponds to one bit position of the fingerprint code.
Six, retrieve the fingerprint codes of the published images and compare them with this fingerprint code using the Hamming distance. If the distance is greater than 5, the two images are not similar. If the Hamming distance is within 5, the two images are judged similar; the smaller the Hamming distance, the more similar the images.
Seven, if the Hamming distance is greater than 5, save this fingerprint code to the hash set. An image whose distance is within 5 cannot be published.
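Steps four to seven can be sketched as below. The function names average_hash, hamming_distance and try_publish are hypothetical, a plain Python list stands in for the stored fingerprint set, and a pixel exactly equal to the mean is mapped to 0, which is an assumption because the description only covers the greater-than and less-than cases.

```python
def average_hash(thumb):
    """Build the 64-bit fingerprint of an 8x8 thumbnail, row by row."""
    values = [v for row in thumb for v in row]
    mean = sum(values) / len(values)
    bits = 0
    for v in values:
        # Fixed pixel order: each pixel contributes exactly one bit.
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def try_publish(fingerprint, published, threshold=5):
    """Return False if a stored fingerprint is within the threshold.

    Otherwise record the fingerprint and return True.
    """
    for other in published:
        if hamming_distance(fingerprint, other) <= threshold:
            return False
    published.append(fingerprint)
    return True
```

This sketch compares the new fingerprint against every stored one, so the cost of each check grows linearly with the number of published images.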

Claims (6)

1. A method for near-duplicate image deduplication, characterized in that the method comprises:
Identifying as the same image those images derived from one image by stamping, watermarking, cropping, or reducing or increasing the number of pixels, whether compared with the original or with another edited version of the same image. The SHA-1 code of the image to be judged is compared first; if the codes are the same, the images are judged identical. If no match is found, the image is scaled and its features are extracted, the pixels of the image are averaged, and a 64-bit binary code is generated in pixel order; the Hamming distance between this code and the 64-bit codes previously stored in the hash set is then computed, and if the Hamming distance is less than a certain threshold, a similar image has already been published.
2. The method according to claim 1, characterized in that it further comprises:
The SHA-1 codes of previous images are stored in the hash set, so that two identical images can be identified efficiently without further judgment.
3. The method according to claim 1, characterized in that it further comprises:
Features are extracted from every image: the image is shrunk to a thumbnail of 8*8 size, 64 pixels in total; the values of the 64 pixels are averaged, and this mean is then used as the reference against which each pixel is compared, generating a 64-bit binary code in pixel order.
4. The method according to claim 1, characterized in that it further comprises:
The Hamming distance is computed between the generated 64-bit binary code and every binary code of published images stored in the hash set, and whether two images are near-duplicates is judged against the threshold.
5. The method according to claim 4, characterized in that it further comprises:
The 64-bit binary codes of all thumbnails are generated bit by bit in correspondence with the pixels of the image, and the bits must not be out of order.
6. The method according to claim 1, characterized in that it further comprises:
The SHA-1 codes and the 64-bit binary codes generated for all images are stored in a hash structure.
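The fingerprint and comparison recited in claims 1, 3 and 4 can be stated compactly. The notation below (p_i for the i-th thumbnail pixel in fixed order, \mu for the mean, b_i for the i-th fingerprint bit, d_H for the Hamming distance, T for the threshold) is introduced here for illustration and does not appear in the application. Mapping a pixel exactly equal to the mean to 0 is an assumption.

```latex
\mu = \frac{1}{64}\sum_{i=1}^{64} p_i, \qquad
b_i = \begin{cases} 1, & p_i > \mu \\ 0, & \text{otherwise} \end{cases}, \qquad
d_H(a, b) = \sum_{i=1}^{64} \lvert a_i - b_i \rvert
```

Two images with fingerprints a and b are treated as near-duplicates when d_H(a, b) does not exceed T. The embodiment uses T = 5, while claim 1 leaves the threshold unspecified.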
CN201410490922.5A 2014-09-24 2014-09-24 Technical method for intelligently removing reduplication of information images Pending CN104200499A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410490922.5A CN104200499A (en) 2014-09-24 2014-09-24 Technical method for intelligently removing reduplication of information images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410490922.5A CN104200499A (en) 2014-09-24 2014-09-24 Technical method for intelligently removing reduplication of information images

Publications (1)

Publication Number Publication Date
CN104200499A true CN104200499A (en) 2014-12-10

Family

ID=52085785

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410490922.5A Pending CN104200499A (en) 2014-09-24 2014-09-24 Technical method for intelligently removing reduplication of information images

Country Status (1)

Country Link
CN (1) CN104200499A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5745608A (en) * 1994-08-19 1998-04-28 Hewlett-Packard Company Storing data compressed with arithmetic coding in non-contiguous memory
US6452973B1 (en) * 1998-11-25 2002-09-17 Electronics And Telecommunications Research Institute System and method for converting H.261 compressed moving picture data to MPEG-1 compressed moving picture data on compression domain
CN101527829A (en) * 2008-03-07 2009-09-09 华为技术有限公司 Method and device for processing video data
CN103116628A (en) * 2013-01-31 2013-05-22 新浪网技术(中国)有限公司 Image file digital signature and judgment method and judgment device of repeated image file

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105138245A (en) * 2015-09-30 2015-12-09 北京奇虎科技有限公司 Deduplication processing method and device for screenshot pictures of intelligent terminal
CN105138245B (en) * 2015-09-30 2018-06-29 北京奇虎科技有限公司 A kind of duplicate removal treatment method and device of intelligent terminal screenshot picture
CN105930391A (en) * 2016-04-14 2016-09-07 京东方科技集团股份有限公司 Update method and image server of image sample database of super-resolution image system
CN105930391B (en) * 2016-04-14 2019-06-07 京东方科技集团股份有限公司 Update method and image server of the supersolution as image sample data library in system
CN106528743A (en) * 2016-11-01 2017-03-22 山东浪潮云服务信息科技有限公司 High-efficiency similar picture identification method based on picture mining technology
CN108416221A (en) * 2018-01-22 2018-08-17 西安电子科技大学 Safe set of metadata of similar data possesses proof scheme in a kind of cloud environment
CN109918518A (en) * 2019-01-31 2019-06-21 平安科技(深圳)有限公司 Picture duplicate checking method, apparatus, computer equipment and storage medium
CN110321447A (en) * 2019-07-08 2019-10-11 北京字节跳动网络技术有限公司 Determination method, apparatus, electronic equipment and the storage medium of multiimage

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518000 Guangdong city of Shenzhen province Nanshan District South Road four No. 18 SKYWORTH semiconductor design building no.5003 6 floor unit 605-610

Applicant after: Ying Weinuo Science and Technology Ltd. of Shenzhen

Address before: 518000 Guangdong province Shenzhen City South Road four SKYWORTH semiconductor building 6 floor

Applicant before: Ying Weinuo Science and Technology Ltd. of Shenzhen

CB02 Change of applicant information
RJ01 Rejection of invention patent application after publication

Application publication date: 20141210

RJ01 Rejection of invention patent application after publication