CN104200499A - Technical method for intelligently removing reduplication of information images - Google Patents
Technical method for intelligently removing reduplication of information images
- Publication number
- CN104200499A (application CN201410490922.5A)
- Authority
- CN
- China
- Prior art keywords
- picture
- pictures
- images
- pixel
- code
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a picture deduplication method, and in particular a method for deduplicating similar pictures. The method comprises the following steps: computing a SHA1 (Secure Hash Algorithm 1) code for each picture, and storing the SHA1 codes in a hash table according to their validity period. Identical pictures have identical SHA1 codes, so the method can determine at low cost whether two pictures are the same. If a picture has been modified, it is scaled down, its features are extracted, and the similarity of the two pictures is judged by their Hamming distance. The method solves the problem of similar or identical pictures appearing repeatedly and improves the user's reading experience.
Description
Technical field
The present invention relates to the technical field of picture deduplication in smartphone information software, and in particular to a technical method for detecting and removing duplicate pictures in internet applications.
Background
With the explosive growth of information on the internet, the number of pictures circulating online keeps growing. When the same picture has been watermarked, cropped, or otherwise processed, the variants are hard to tell apart even for a person who looks at them repeatedly, let alone for a machine. Handling large volumes of information pictures manually is time-consuming and costly, and the results are poor. A method that lets a machine deduplicate pictures quickly and effectively, and in particular screen out pictures that are very similar, is therefore important.
Picture deduplication determines whether two pictures are identical or nearly identical, so that two similar pictures are not shown to the user at the same time. One current method of picture deduplication is as follows: a SHA1 code is computed for each picture, and the SHA1 code is stored in a hash table according to its validity period. Identical pictures have identical SHA1 codes. This method can check at low cost whether two pictures are identical, but a modified picture must instead be scaled, have its features extracted, and then be compared by Hamming distance to obtain a similarity value for the two pictures.
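The exact-duplicate check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the class name `ExpiringSha1Store`, the method `seen_before`, and the in-memory dictionary standing in for the hash store are all hypothetical, and the validity period is modeled as a simple time-to-live in seconds.

```python
import hashlib
import time


class ExpiringSha1Store:
    """Hypothetical in-memory stand-in for a hash store whose entries expire."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._expires_at = {}  # SHA-1 hex digest -> expiry timestamp

    def seen_before(self, image_bytes, now=None):
        """Return True if identical bytes were registered and have not expired.

        Unseen or expired pictures are registered as a side effect.
        """
        now = time.time() if now is None else now
        digest = hashlib.sha1(image_bytes).hexdigest()
        expiry = self._expires_at.get(digest)
        if expiry is not None and expiry > now:
            return True
        self._expires_at[digest] = now + self.ttl_seconds
        return False


store = ExpiringSha1Store(ttl_seconds=3600)
first = store.seen_before(b"raw image bytes", now=0)
second = store.seen_before(b"raw image bytes", now=10)
edited = store.seen_before(b"raw image bytes + watermark", now=20)
after_expiry = store.seen_before(b"raw image bytes", now=4000)
```

Because SHA1 is computed over the raw bytes, any edit to the picture yields a different code, which is why the byte-level check alone cannot catch modified pictures.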
Summary of the invention
The main purpose of the present invention is to provide a picture deduplication solution, in particular for near-duplicate pictures, so that similar or identical pictures do not appear repeatedly and the user's reading experience is improved.
To address the above problem, the following solution is provided:
1. Convert the picture to be compared into a SHA1 code and compare it with the SHA1 codes of other pictures stored in the hash table, finding identical pictures by SHA1 code. This is the most efficient and lowest-cost check.
2. If no matching SHA1 code is found, crop the picture. Observation shows that most watermarks sit at the bottom of a picture, so the picture is cropped to cut the watermark away from the original.
3. Extract the features of the picture: shrink the cropped picture to a thumbnail of 8*8 size, that is, 64 pixels.
4. Compute the binary fingerprint of the picture according to the positions of the pixels.
5. Compute the Hamming distance between picture fingerprints pairwise. The Hamming distance indicates how close two pictures are.
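Steps 2 to 5 can be sketched in plain Python as follows. This is an illustrative sketch only. The assumptions are that the picture is a grayscale image given as a list of rows, that the watermark sits in the bottom 20% of the rows, that shrinking is done by block averaging, and that a pixel equal to the mean is recorded as 0. The function names and the `keep_percent` parameter are hypothetical and do not come from the patent.

```python
def crop_bottom(pixels, keep_percent=80):
    """Keep the top rows; the bottom strip is assumed to hold the watermark."""
    keep_rows = max(1, len(pixels) * keep_percent // 100)
    return pixels[:keep_rows]


def shrink_to_8x8(pixels):
    """Shrink a grayscale image (list of rows, at least 8x8) by block averaging."""
    height, width = len(pixels), len(pixels[0])
    if height < 8 or width < 8:
        raise ValueError("image must be at least 8x8 pixels")
    thumb = []
    for r in range(8):
        top, bottom = r * height // 8, (r + 1) * height // 8
        row = []
        for c in range(8):
            left, right = c * width // 8, (c + 1) * width // 8
            block = [pixels[y][x] for y in range(top, bottom) for x in range(left, right)]
            row.append(sum(block) / len(block))
        thumb.append(row)
    return thumb


def fingerprint(thumb):
    """Build a 64-bit fingerprint: 1 where a pixel is above the thumbnail mean."""
    values = [v for row in thumb for v in row]  # row-major pixel order
    mean = sum(values) / len(values)
    bits = 0
    for v in values:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


# A 16x20 test picture: dark left half, bright right half.
original = [[10 if x < 8 else 200 for x in range(16)] for _ in range(20)]
# The same picture with a white "watermark" strip across the bottom four rows.
watermarked = [row[:] for row in original]
for y in range(16, 20):
    watermarked[y] = [255] * 16

fp_original = fingerprint(shrink_to_8x8(crop_bottom(original)))
fp_watermarked = fingerprint(shrink_to_8x8(crop_bottom(watermarked)))
distance_cropped = hamming(fp_original, fp_watermarked)
distance_uncropped = hamming(
    fingerprint(shrink_to_8x8(original)),
    fingerprint(shrink_to_8x8(watermarked)),
)
```

In this toy case the cropped fingerprints match exactly, while skipping the crop leaves the watermark strip in the thumbnail and produces a nonzero distance, which illustrates why cropping precedes feature extraction.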
Embodiment:
One. When a picture is submitted for publication, first convert it into a SHA1 code and compare it with the hash set stored in the redis database. If a match is found, the picture has already been published. If not, store the SHA1 code of the picture in the hash set and continue with the following steps.
Two. Crop the picture to extract its central features. This effectively filters out watermarks at the edges of the picture and improves the accuracy of the judgment.
Three. Extract the features of the cropped picture. The features of a picture can be represented by a thumbnail, so the picture is shrunk to a thumbnail of 8*8 size, 64 pixels in total.
Four. Extract the value of each pixel and compute the mean of the 64 pixels.
Five. Compare each pixel with the mean of the picture: if the value is greater than the mean, record 1; if it is less than the mean, record 0. This forms a 64-bit binary code, which is the fingerprint of the picture. The 64 bits are ordered: the position of each pixel in the 8*8 thumbnail corresponds to a bit position in the fingerprint code.
Six. Retrieve the fingerprint codes of published pictures and compare them with this fingerprint code by Hamming distance. If the distance is greater than 5, the two pictures are not similar. If the distance is within 5, the two pictures are judged similar; the smaller the Hamming distance, the more similar the pictures.
Seven. If the Hamming distance is greater than 5, save this fingerprint code to the hash set. A picture with a distance within 5 cannot be published.
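The publication decision in steps One, Six and Seven can be sketched as below. This is a hedged sketch rather than the embodiment itself. A Python set and list stand in for the redis hash set, the fingerprint is assumed to have been computed already (steps Two to Five), and the class `PublishedPictures`, the method `try_publish`, and the returned status strings are hypothetical names. A distance of exactly 5 is treated as similar, following the "within 5" wording.

```python
import hashlib

SIMILARITY_THRESHOLD = 5  # from the text: a distance within 5 means "similar"


def hamming_distance(a, b):
    """Count the bit positions in which two 64-bit fingerprints differ."""
    return bin(a ^ b).count("1")


class PublishedPictures:
    """Hypothetical in-memory stand-in for the hash sets kept in redis."""

    def __init__(self):
        self.sha1_codes = set()
        self.fingerprints = []

    def try_publish(self, image_bytes, fingerprint):
        # Step One: an identical SHA1 code means the picture was already published.
        digest = hashlib.sha1(image_bytes).hexdigest()
        if digest in self.sha1_codes:
            return "rejected: identical"
        self.sha1_codes.add(digest)
        # Step Six: compare against the fingerprint of every published picture.
        for known in self.fingerprints:
            if hamming_distance(known, fingerprint) <= SIMILARITY_THRESHOLD:
                return "rejected: similar"
        # Step Seven: only fingerprints of dissimilar pictures are stored.
        self.fingerprints.append(fingerprint)
        return "published"


store = PublishedPictures()
results = [
    store.try_publish(b"picture A", 0x0F0F0F0F0F0F0F0F),
    store.try_publish(b"picture A", 0x0F0F0F0F0F0F0F0F),       # same bytes
    store.try_publish(b"picture A, edited", 0x0F0F0F0F0F0F0F0E),  # one bit differs
    store.try_publish(b"picture B", 0xFFFFFFFFFFFFFFFF),       # 32 bits differ
]
```

The linear scan over stored fingerprints mirrors the text, which compares against every published picture; the sketch makes no attempt to index fingerprints for faster lookup.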
Claims (6)
1. A method for near-duplicate removal of pictures, characterized in that the method comprises:
Identifying pictures derived from the same picture by stamping, watermarking, cropping, or reducing or increasing the number of pixels, so that a picture edited from the original, or different edits of the same picture, are judged to be the same picture; comparing the SHA1 code of the picture under judgment, and if the codes are the same, judging the pictures identical; if the comparison finds no match, scaling the picture and extracting picture features, then averaging the pixels of the picture and generating a 64-bit binary code in order, and comparing its Hamming distance against the 64-bit codes previously stored in the hash set; if the Hamming distance is less than a certain threshold, a similar picture has already been published.
2. The method according to claim 1, characterized in that it further comprises:
Storing the SHA1 codes of past pictures in the hash set, so that two identical pictures can be identified more efficiently and no further judgment is needed.
3. The method according to claim 1, characterized in that it further comprises:
Extracting features from each picture: shrinking the picture to a thumbnail of 8*8 size, 64 pixels; taking the value of each pixel and averaging the values of the 64 pixels; and using this mean as the standard of comparison to generate a 64-bit binary code in pixel order.
4. The method according to claim 1, characterized in that it further comprises:
Computing the Hamming distance between the generated 64-bit binary code and the binary codes of all published pictures stored in the hash set, and judging whether two pictures are similar according to the threshold.
5. The method according to claim 4, characterized in that it further comprises:
The 64-bit binary codes of all thumbnails must be generated in one-to-one correspondence with the pixels of the picture, and must not be out of order.
6. The method according to claim 1, characterized in that it further comprises:
The SHA1 codes and the 64-bit binary codes generated for all pictures must be stored in a hash table.
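The ordering requirement in claim 5 can be illustrated with a small sketch. It assumes an 8*8 grayscale thumbnail given as a list of rows and contrasts two hypothetical traversal orders, row-major and column-major; the `order` parameter and function names are illustrative only and are not part of the claims.

```python
def fingerprint(thumb, order="row"):
    """Build a 64-bit code from an 8x8 thumbnail using the given pixel order."""
    size = len(thumb)
    if order == "row":
        values = [thumb[r][c] for r in range(size) for c in range(size)]
    else:  # column-major traversal, used here only to show a mismatch
        values = [thumb[r][c] for c in range(size) for r in range(size)]
    mean = sum(values) / len(values)
    bits = 0
    for v in values:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


# One thumbnail: dark left half, bright right half.
thumb = [[10] * 4 + [200] * 4 for _ in range(8)]

same_order = hamming(fingerprint(thumb, "row"), fingerprint(thumb, "row"))
mixed_order = hamming(fingerprint(thumb, "row"), fingerprint(thumb, "column"))
```

When both codes use the same pixel order the distance for the same thumbnail is zero, whereas mixing orders makes an identical picture look dissimilar, so all stored codes need to share one fixed order for the threshold comparison to be meaningful.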
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410490922.5A CN104200499A (en) | 2014-09-24 | 2014-09-24 | Technical method for intelligently removing reduplication of information images |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410490922.5A CN104200499A (en) | 2014-09-24 | 2014-09-24 | Technical method for intelligently removing reduplication of information images |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104200499A (en) | 2014-12-10 |
Family
ID=52085785
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410490922.5A (Pending; published as CN104200499A) | Technical method for intelligently removing reduplication of information images | 2014-09-24 | 2014-09-24 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104200499A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5745608A (en) * | 1994-08-19 | 1998-04-28 | Hewlett-Packard Company | Storing data compressed with arithmetic coding in non-contiguous memory |
US6452973B1 (en) * | 1998-11-25 | 2002-09-17 | Electronics And Telecommunications Research Institute | System and method for converting H.261 compressed moving picture data to MPEG-1 compressed moving picture data on compression domain |
CN101527829A (en) * | 2008-03-07 | 2009-09-09 | 华为技术有限公司 | Method and device for processing video data |
CN103116628A (en) * | 2013-01-31 | 2013-05-22 | 新浪网技术(中国)有限公司 | Image file digital signature and judgment method and judgment device of repeated image file |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105138245A (en) * | 2015-09-30 | 2015-12-09 | 北京奇虎科技有限公司 | Deduplication processing method and device for screenshot pictures of intelligent terminal |
CN105138245B (en) * | 2015-09-30 | 2018-06-29 | 北京奇虎科技有限公司 | A kind of duplicate removal treatment method and device of intelligent terminal screenshot picture |
CN105930391A (en) * | 2016-04-14 | 2016-09-07 | 京东方科技集团股份有限公司 | Update method and image server of image sample database of super-resolution image system |
CN105930391B (en) * | 2016-04-14 | 2019-06-07 | 京东方科技集团股份有限公司 | Update method and image server of the supersolution as image sample data library in system |
CN106528743A (en) * | 2016-11-01 | 2017-03-22 | 山东浪潮云服务信息科技有限公司 | High-efficiency similar picture identification method based on picture mining technology |
CN108416221A (en) * | 2018-01-22 | 2018-08-17 | 西安电子科技大学 | Safe set of metadata of similar data possesses proof scheme in a kind of cloud environment |
CN109918518A (en) * | 2019-01-31 | 2019-06-21 | 平安科技(深圳)有限公司 | Picture duplicate checking method, apparatus, computer equipment and storage medium |
CN110321447A (en) * | 2019-07-08 | 2019-10-11 | 北京字节跳动网络技术有限公司 | Determination method, apparatus, electronic equipment and the storage medium of multiimage |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: 518000 Guangdong city of Shenzhen province Nanshan District South Road four No. 18 SKYWORTH semiconductor design building no.5003 6 floor unit 605-610; Applicant after: Ying Weinuo Science and Technology Ltd. of Shenzhen; Address before: 518000 Guangdong province Shenzhen City South Road four SKYWORTH semiconductor building 6 floor; Applicant before: Ying Weinuo Science and Technology Ltd. of Shenzhen |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20141210 |