WO2022156538A1 - Method for generating a file using shared pictures, server side, and readable storage medium - Google Patents
Method for generating a file using shared pictures, server side, and readable storage medium
- Publication number
- WO2022156538A1 (PCT/CN2022/070348, CN2022070348W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- picture
- target
- pictures
- file
- captured
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/55—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/176—Support for shared access to files; File sharing support
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/5866—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/587—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/186—Templates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/60—Editing figures and text; Combining figures or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
Definitions
- the invention relates to the technical field of artificial intelligence, and in particular, to a method for generating files by using shared pictures, a server side and a readable storage medium.
- the purpose of the present invention is to provide a method, a server side and a readable storage medium for generating a file from shared pictures, so as to solve the problem that, when a file is recorded by taking pictures, the captured pictures lack coherence and the content of the file is not displayed completely.
- the present invention provides a method for generating files by using shared pictures, including:
- Receive the captured pictures from the client and classify all the captured pictures according to the received picture feature information of the captured pictures, so as to store the captured pictures belonging to the same file in the same candidate set;
- the picture content of a plurality of the captured pictures in the same candidate set is identified and/or compared, so as to select target pictures from the candidate set and form a target picture set; and,
- all the target pictures in the target picture set are sorted according to a preset sorting rule, and a target file is synthesized according to the sorting result.
- optionally, the method of selecting the target pictures and forming the target picture set includes: using a preset selection rule, one of the captured pictures from each of the sub-candidate sets is selected as a target picture to be included in the target picture set.
- the picture content of a plurality of the captured pictures in the same candidate set is identified, so as to select the captured pictures whose edge integrity and sharpness meet the conditions. The method for selecting these captured pictures includes:
- the edge integrity of each of the captured pictures is identified, so as to select the captured pictures whose edge integrity meets the conditions;
- the sharpness of the captured pictures whose edge integrity meets the conditions is identified, so as to select the captured pictures whose edge integrity and sharpness meet the conditions.
- for all the captured pictures of the same picture content, if neither the integrity nor the sharpness meets the conditions, request information for those captured pictures is sent to the client, so as to obtain new captured pictures with the same picture content; and,
- if the requested captured pictures are not updated within a set time, then, among all the captured pictures whose sharpness differs from the qualifying sharpness by less than a set value, the captured picture with the highest integrity is included in the target picture set.
- the picture content of a plurality of the captured pictures whose edge integrity and sharpness meet the conditions is compared, so that captured pictures with different picture content are placed into different sub-candidate sets of the candidate set. The methods for doing so include:
- using a character recognition model to identify whether the page numbers of a plurality of the captured pictures are the same and, if so, determining that the picture content is the same and including the pictures in the same sub-candidate set; and/or,
- using a picture feature extraction model to compute the similarity of the picture feature values of a plurality of the captured pictures and, when the similarity of the picture feature values reaches a preset similarity threshold, determining that the picture content is the same and including the pictures in the same sub-candidate set.
- the preset selection rule includes: ranking a plurality of the captured pictures in the same sub-candidate set according to the integrity and/or sharpness of the picture content; and taking the highest-ranked captured picture as the target picture.
- the preset sorting rule includes: sorting according to the relationship between the shooting time, page number and/or title of the pictures.
- the method for sorting all the target pictures in the target picture set and synthesizing the target file according to the sorting result includes: sorting all the target pictures currently in the target picture set in real time and synthesizing a target file; and, after the target picture set is updated, updating the target file with the new target pictures in the target picture set to obtain an updated target file, until the target picture set is no longer updated within a set time.
- the method for sorting all the target pictures in the target picture set and synthesizing the target file according to the sorting result further includes: if a new target picture appears in the target picture set whose integrity is at least greater than that of the current target picture with the same picture content in the current target file, replacing the current target picture with the new target picture, so as to update the current target file.
- alternatively, the method for sorting all the target pictures in the target picture set and synthesizing the target file according to the sorting result includes: after an end page appears in the target picture set, or after the target picture set is no longer updated within a set time, sorting all the target pictures in the target picture set and synthesizing the target file.
- whether an end identification character exists in each of the target pictures is identified according to a character recognition model, so as to determine whether the target picture is an end page.
- the picture feature information of the captured picture includes one or more of picture location information, user input information and picture content information.
- the method for generating a file by using a shared picture further includes:
- the synthesized target file is output by using a preset output template, and/or the synthesized target file is modified.
- after all the target pictures are sorted, the method for generating a file from shared pictures further includes:
- the sorting result is displayed to the client, and after the client confirms, the target file is synthesized according to the sorting result.
- the present invention also provides a server side, including a processor and a memory, the memory stores a computer program, and when the computer program is executed by the processor, the above-mentioned method for generating a file using a shared picture is implemented.
- the server side has a sharing entry, and the sharing entry is used to share the synthesized file to the public platform or other terminals.
- the present invention also provides a readable storage medium, where a computer program is stored in the readable storage medium, and when the computer program is executed by a processor, the above-mentioned method for generating a file by using a shared picture is implemented.
- in summary, captured pictures from the clients are received, and all the captured pictures are classified according to the received picture feature information of the captured pictures, so as to store the captured pictures belonging to the same file in the same candidate set; the picture content of a plurality of the captured pictures in the same candidate set is identified and/or compared, so as to select target pictures from the candidate set and form a target picture set; and all the target pictures in the target picture set are sorted according to a preset sorting rule, and a target file is synthesized according to the sorting result.
- FIG. 1 is a flowchart of a method for generating a file by using a shared picture provided by an embodiment of the present invention
- FIG. 2 is a flowchart of forming a target picture set in an embodiment of the present invention
- FIG. 3 is an example diagram of sorting according to the order of titles in an embodiment of the present invention.
- embodiments of the present invention provide a method, a server and a readable storage medium for generating files by using shared pictures.
- FIG. 1 is a flowchart of a method for generating a file by using a shared picture provided by an embodiment of the present invention. As shown in FIG. 1 , an embodiment of the present invention provides a method for generating a file by using a shared picture, including the following steps:
- S11: receiving captured pictures from the clients, and classifying all the captured pictures according to the received picture feature information of the captured pictures, so as to store the captured pictures belonging to the same file in the same candidate set;
- S12: identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set, so as to select target pictures from the candidate set and form a target picture set; and,
- S13: sorting all the target pictures in the target picture set according to a preset sorting rule, and synthesizing a target file according to the sorting result.
- the method for generating a file by using a shared picture in the embodiment of the present invention can be applied to the server side in the embodiment of the present invention.
- the server side is a public server side, for example, it can be a personal computer, a mobile terminal, etc., and the mobile terminal can be a hardware device with various operating systems, such as a mobile phone and a tablet computer.
- in addition, it should be noted that the synthesized target files in this document include Word documents, PDF documents, Excel documents, PPT documents, TXT documents, and so on.
- after multiple photographers upload pictures of the captured file to the server side, the server side can sequentially classify, identify and/or compare, and sort all the obtained pictures, and finally synthesize the target file, thereby solving the problem that, when a file is recorded by taking pictures, the captured pictures lack coherence and the content of the file is not displayed completely.
- the picture feature information of the captured picture includes one or more of picture location information, user input information and picture content information.
- Pictures with the same location information are preliminarily identified as pictures of the same file.
- different files are shot and uploaded in the same location, such as different rooms in the same building. In this case, it is necessary to combine other picture feature information to judge.
- the user input information is, for example, the meeting location, meeting name, meeting content, etc. input by the user when uploading a picture. If the input information of different users is the same, it is considered that they are shooting the same file and need to be placed in the same candidate set.
- the picture content information is, for example, the title of the file; the features of the pictures uploaded by users can also be identified with a pre-trained recognition model, and if the features of the pictures taken by different users are similar, they are considered to be shooting the same file.
- at the same time, the pictures taken and uploaded by the same user within a set time are regarded as pictures of the same file (after the set time, the user may move on to shoot other files) and are put into the same candidate set; in this way, all pictures of the same file taken by different users are placed in the same candidate set, as the grouping sketch below illustrates.
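- As an illustration of this classification step only, the following Python sketch groups uploaded pictures into candidate sets by user-entered meeting name, reported location, and a per-user time window. The record fields (`location`, `meeting_name`, `upload_time`) and the grouping policy are assumptions for illustration, not the patented implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from collections import defaultdict
from typing import Optional

# Hypothetical record for one uploaded picture; field names are illustrative only.
@dataclass
class CapturedPicture:
    user_id: str
    path: str
    upload_time: datetime
    location: Optional[str] = None      # e.g. a GPS cell or room label
    meeting_name: Optional[str] = None  # user input entered when uploading

def group_into_candidate_sets(pictures, time_window=timedelta(hours=2)):
    """Group pictures assumed to show the same file into one candidate set.

    Policy sketched here: the same user-entered meeting name (or, failing
    that, the same reported location) puts pictures together; a long gap
    since a user's previous upload starts a new candidate set for that user.
    """
    candidate_sets = defaultdict(list)
    last_seen = {}  # user_id -> upload_time of that user's previous picture

    for pic in sorted(pictures, key=lambda p: p.upload_time):
        key = pic.meeting_name or pic.location or "unknown"

        prev = last_seen.get(pic.user_id)
        if prev is not None and pic.upload_time - prev > time_window:
            # Past the set time the user may be shooting a different file.
            key = f"{key}#{pic.upload_time:%Y%m%d%H%M}"

        last_seen[pic.user_id] = pic.upload_time
        candidate_sets[key].append(pic)

    return dict(candidate_sets)
```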
- in step S12, by identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set, the captured pictures are screened, so that the target pictures in the target picture set are all clear and complete.
- specifically, as shown in FIG. 2, identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set to select target pictures from the candidate set and form a target picture set may include the following steps:
- S121: identifying the picture content of a plurality of the captured pictures in the same candidate set, so as to select the captured pictures whose edge integrity and sharpness meet the conditions;
- S122: comparing the picture content of the captured pictures whose edge integrity and sharpness meet the conditions, so as to place captured pictures with different picture content into different sub-candidate sets of the candidate set; and,
- S123: using a preset selection rule, selecting one captured picture from each sub-candidate set as a target picture to be included in the target picture set.
- in step S121, the method of identifying the picture content of a plurality of the captured pictures in the same candidate set, so as to select the captured pictures whose edge integrity and sharpness satisfy the conditions, includes:
- the edge integrity of each of the captured pictures is identified, so as to select the captured pictures whose edge integrity meets the conditions;
- the sharpness of the captured pictures whose edge integrity meets the conditions is identified, so as to select the captured pictures whose edge integrity and sharpness meet the conditions.
- an edge integrity threshold and a blurriness threshold can be defined to determine whether the edge integrity and sharpness of the captured picture satisfy the conditions, respectively.
- specifically, an edge recognition model is used to identify the edge integrity of each captured picture; if the edge integrity of a captured picture is greater than the edge integrity threshold, its edge integrity is determined to satisfy the condition. A blurriness recognition model is used to identify the blur value of each captured picture; if the blur value of a captured picture is less than the blurriness threshold, its sharpness is determined to satisfy the condition.
- preferably, the edge integrity threshold is set to a relatively low value, to avoid the situation where, for example due to occlusion, all the captured pictures are incomplete and none is selected, which would result in missing pages in the resulting target file.
- similarly, when setting the blurriness threshold, a relatively high value can be chosen, to avoid the situation where, for example because of the shooting conditions (such as low light), none of the captured pictures is very clear and none is selected, which would likewise lead to missing pages in the resulting target file.
- in this embodiment, when screening the captured pictures, picture integrity is prioritized over picture sharpness, so as to ensure that at least no file content is missing while also speeding up the screening.
- in other embodiments, an edge recognition model may be used to identify all the captured pictures whose edge integrity satisfies the condition while a blurriness recognition model identifies all the captured pictures whose sharpness satisfies the condition; then, among all the captured pictures of the same picture content, the captured pictures whose integrity and sharpness both meet the conditions are selected.
- the edge recognition model can be obtained through pre-training.
- when the edge recognition model is used to identify the edges of the PPT region in a picture, if a part is missing, the missing part can be completed according to the already identified edge lines; the edge integrity is then judged by the ratio of the identified area to the complete area, or by the ratio of the length of the identified edge lines to the length of the complete edge outline.
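- One way to approximate that area ratio with standard OpenCV primitives is sketched below: the largest contour in the edge map stands in for the identified slide region, and its rotated bounding rectangle stands in for the completed outline. This is an assumption-laden illustration of the area-ratio idea, not the pre-trained edge recognition model described above; the threshold value is also only illustrative.

```python
import cv2
import numpy as np

def edge_integrity(image_path: str) -> float:
    """Rough edge-integrity score in [0, 1].

    Assumption: the slide/document is the largest contour in the Canny
    edge map, and its minimum-area rectangle approximates the completed
    outline that the text describes reconstructing from the found edges.
    """
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    edges = cv2.dilate(edges, np.ones((3, 3), np.uint8))  # close small gaps
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0.0
    largest = max(contours, key=cv2.contourArea)
    identified_area = cv2.contourArea(largest)
    # Completed region approximated by the rotated bounding rectangle.
    (_, _), (w, h), _ = cv2.minAreaRect(largest)
    completed_area = w * h
    if completed_area == 0:
        return 0.0
    return float(identified_area / completed_area)

if __name__ == "__main__":
    EDGE_INTEGRITY_THRESHOLD = 0.6  # illustrative value, not from the patent
    score = edge_integrity("slide_photo.jpg")
    print(score, score > EDGE_INTEGRITY_THRESHOLD)
```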
- the blurriness recognition model can, for example, use OpenCV and the Laplacian operator to compute the picture variance, i.e. the Variance of the Laplacian algorithm: a channel of the picture (generally the grayscale values) is convolved with a Laplacian kernel, and the variance (the square of the standard deviation) of the result is computed. The larger the picture variance, the sharper the picture. If the picture variance is less than a predefined threshold, the picture can be considered not to satisfy the sharpness condition; if it is above the predefined threshold, the picture can be considered to satisfy it.
- in practice, a sharpness threshold can also be used: the computed picture variance is converted into a sharpness score, and a picture whose sharpness is below the predefined sharpness threshold is considered not to satisfy the condition, while a picture above it is considered to satisfy the condition.
- the sharpness threshold can be adjusted manually according to the actual situation, provided the human eye can still recognize the picture content.
- in other embodiments, the blurriness recognition model can also use the grayscale variance algorithm, the sum of squared grayscale differences, the Brenner function, and other methods well known to those skilled in the art to determine whether the sharpness of a captured picture satisfies the condition; these are not repeated here.
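- A minimal sketch of the Variance-of-the-Laplacian check described above, using OpenCV; the starting threshold of 100.0 is a commonly used default for this metric, not a value taken from the patent, and would be tuned as the text suggests.

```python
import cv2

def laplacian_variance(image_path: str) -> float:
    """Variance of the Laplacian: larger means sharper."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)
    return float(cv2.Laplacian(gray, cv2.CV_64F).var())

def is_sharp_enough(image_path: str, blur_threshold: float = 100.0) -> bool:
    """True if the picture meets the sharpness condition.

    The threshold should be adjusted so that pictures a human can still
    read are not rejected, as the surrounding text recommends.
    """
    return laplacian_variance(image_path) >= blur_threshold
```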
- the method for identifying and/or comparing the acquired picture contents of the plurality of captured pictures, and selecting a target picture to form a target picture set further includes:
- for all the captured pictures of the same picture content, if neither the integrity nor the sharpness meets the conditions, request information for those captured pictures is sent to the client, so as to obtain new captured pictures with the same picture content;
- if the requested captured pictures are not updated within a set time, then, among all the captured pictures whose sharpness differs from the qualifying sharpness by less than a set value, the captured picture with the highest integrity is included in the target picture set.
- for example, set the integrity threshold to threshold A and the picture sharpness threshold to threshold B, and set a minimum sharpness condition: the difference between the picture sharpness and threshold B is less than threshold C. If all the captured pictures whose picture content is D have integrity below threshold A and sharpness below threshold B, request information for pictures with content D is sent to the client again. If the client does not send a picture with content D within the set time, then, among all the captured pictures whose sharpness differs from threshold B by less than threshold C, the captured picture with the highest integrity is included in the target picture set.
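- The threshold-A/B/C fallback just described can be written down directly. The sketch below assumes each candidate picture already carries precomputed `integrity` and `sharpness` scores (for instance from the two checks above) and that scores and thresholds live on the same scale; it is an illustration of the selection rule, not the patented implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ScoredPicture:
    path: str
    integrity: float   # e.g. edge-integrity ratio in [0, 1]
    sharpness: float   # e.g. Laplacian variance

def pick_fallback(pictures: List[ScoredPicture],
                  threshold_a: float,   # integrity threshold
                  threshold_b: float,   # sharpness threshold
                  threshold_c: float    # allowed sharpness shortfall
                  ) -> Optional[ScoredPicture]:
    """Select a picture for one piece of content (content "D" in the text).

    If some picture meets both thresholds it is returned outright;
    otherwise, after the re-request times out, the most complete picture
    whose sharpness is within threshold_c of threshold_b is returned.
    """
    qualified = [p for p in pictures
                 if p.integrity >= threshold_a and p.sharpness >= threshold_b]
    if qualified:
        return max(qualified, key=lambda p: (p.integrity, p.sharpness))

    near_sharp = [p for p in pictures
                  if threshold_b - p.sharpness < threshold_c]
    if not near_sharp:
        return None  # nothing usable yet; keep waiting for new uploads
    return max(near_sharp, key=lambda p: p.integrity)
```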
- in this embodiment, re-requesting pictures that do not meet the conditions avoids missing pages in the target file generated later. Hence, in step S121, when setting the edge integrity threshold and the blurriness threshold, relatively high values can be set compared with the previous embodiment, so that the preliminarily screened pictures are all of relatively high quality.
- in practice, the edge integrity threshold can also be adjusted manually according to the actual situation. For example, if the edge area of the file is blank, or the text in the edge area does not affect the reading of the file, the edge integrity threshold can be lowered, for example to 80% or 85%; that is, when the edge integrity of a captured picture exceeds 80%, 85%, and so on, it can be included in the candidate set.
- as noted above, when pictures are obtained from the clients, multiple pictures of the same content may be received. Therefore, in step S12 of this embodiment, besides identifying the picture content in step S121 so that the target pictures in the target picture set are all clear and complete, step S122 is also used to compare the picture content of the obtained captured pictures, so as to ensure that each target picture in the target picture set is unique.
- specifically, in step S122, the methods for comparing the picture content of the captured pictures whose edge integrity and sharpness meet the conditions, so that captured pictures with different picture content are placed into different sub-candidate sets of the candidate set, include:
- using a character recognition model to identify the repetition rate of the picture content of a plurality of the captured pictures and, when the repetition rate exceeds a preset repetition-rate threshold, determining that the picture content is the same and including the pictures in the same sub-candidate set;
- using a character recognition model to identify whether the page numbers of a plurality of the captured pictures are the same and, if so, determining that the picture content is the same and including the pictures in the same sub-candidate set; and/or,
- using a picture feature extraction model to compute the similarity of the picture feature values of a plurality of the captured pictures and, when the similarity of the picture feature values reaches a preset similarity threshold, determining that the picture content is the same and including the pictures in the same sub-candidate set.
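- Both comparison routes can be sketched with off-the-shelf tools: OCR (pytesseract here, standing in for the unspecified character recognition model) for page numbers, and a tiny average-hash fingerprint standing in for the picture feature extraction model. The page-number regular expression and the similarity threshold are assumptions for illustration.

```python
import re
import cv2
import numpy as np
import pytesseract

def read_page_number(image_path: str):
    """OCR the picture and return the last standalone number found, if any."""
    text = pytesseract.image_to_string(cv2.imread(image_path))
    numbers = re.findall(r"\b(\d{1,3})\b", text)
    return int(numbers[-1]) if numbers else None

def average_hash(image_path: str, size: int = 8) -> np.ndarray:
    """8x8 grayscale thumbnail thresholded at its mean: a coarse feature vector."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    small = cv2.resize(gray, (size, size), interpolation=cv2.INTER_AREA)
    return (small > small.mean()).flatten()

def feature_similarity(path_a: str, path_b: str) -> float:
    """Fraction of matching hash bits, in [0, 1]."""
    a, b = average_hash(path_a), average_hash(path_b)
    return float((a == b).mean())

def same_content(path_a: str, path_b: str, sim_threshold: float = 0.9) -> bool:
    """Treat two pictures as the same slide/page if either test agrees."""
    page_a, page_b = read_page_number(path_a), read_page_number(path_b)
    if page_a is not None and page_a == page_b:
        return True
    return feature_similarity(path_a, path_b) >= sim_threshold
```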
- after step S122, there may still be multiple pictures in the same sub-candidate set. Therefore, in step S123, pictures with the same content are screened, to ensure that, of the multiple captured pictures of the same picture content, only one is selected into the target picture set.
- in step S123, the preset selection rule may include: ranking the captured pictures in the same sub-candidate set according to the integrity and/or sharpness of the picture content; and taking the highest-ranked captured picture as the target picture.
- in practice, different selection modes can be set, for example mode one, mode two and mode three: mode one indicates that the user prefers the best integrity, mode two indicates that the user prefers the best sharpness, and mode three indicates that the user prefers the best combined effect of integrity and sharpness.
- in step S11, when entering the target-file request information, the user can also choose the selection mode. If the user chooses mode one, then in this step, when ranking the captured pictures in the same sub-candidate set, they are ranked by the integrity of the picture content: the greater the integrity, the higher the rank. If the user chooses mode two, the captured pictures in the same sub-candidate set are ranked by the sharpness of the picture content: the greater the sharpness, the higher the rank.
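- The mode-dependent ranking can be a one-line sort key; the sketch below reuses records like the `ScoredPicture` objects from the earlier fallback sketch (any object with `integrity` and `sharpness` attributes works), and the mode names and the combined score are assumptions for illustration.

```python
def rank_sub_candidates(pictures, mode: str = "mode3"):
    """Return the pictures of one sub-candidate set, best first.

    mode1: prefer integrity, mode2: prefer sharpness,
    mode3: prefer a combined score (here an unweighted product,
    which is an assumption, not a rule from the text).
    """
    keys = {
        "mode1": lambda p: (p.integrity, p.sharpness),
        "mode2": lambda p: (p.sharpness, p.integrity),
        "mode3": lambda p: p.integrity * p.sharpness,
    }
    return sorted(pictures, key=keys[mode], reverse=True)

def pick_target(pictures, mode: str = "mode3"):
    """Highest-ranked picture becomes the target picture for this content."""
    return rank_sub_candidates(pictures, mode)[0]
```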
- in step S13, the preset sorting rule includes sorting according to the correlation of shooting time, page numbers and/or titles between the pictures, specifically as follows:
- (1) the preset sorting rule may be sorting according to the shooting time of the pictures. The specific process includes: obtaining the creation time of the users' pictures with a time acquisition model, and arranging the pictures in chronological order of their creation time with a picture arrangement model.
- (2) the preset sorting rule may also be sorting according to the page numbers in the pictures: a character recognition model identifies the page numbers in the shared picture set, and the pictures are arranged in ascending page-number order.
- (3) the preset sorting rule may also be sorting according to the relevance of the content in the pictures. In a specific implementation, content relevance refers to the next piece of content that is adjacent to, and located after, the content of the current picture, for example the order of headings and subheadings, the degree of connection between preceding and following content identified with a text recognition model, and any other sorting rule that keeps the target pictures coherent.
- referring to FIG. 3, when sorting by the order of titles, the first-level headings (1, 2 and 3 in FIG. 3) are confirmed first, and then the subheadings under each first-level heading are confirmed and matched one-to-one with their first-level heading; for example, the subheadings of heading 1 are 1.1, 1.2, 1.3..., the subheadings of heading 2 are 2.1, 2.2, 2.3..., and the subheadings of heading 3 are 3.1, 3.2, 3.3...; the pictures are then sorted to form the target file shown in Figure 2.
- the preset sorting rules may also be a combination of the above sorting rules.
- for example, to improve accuracy, besides using the titles, a text recognition model can also be used to recognize the coherence of the content of adjacent pages in the text (the degree of connection of the contextual content).
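- A compact way to express rules (2) and (3) is to parse a page number or a heading number such as "2.1" out of each target picture's recognized text and sort on it. The sketch below assumes the OCR step has already happened, and the regular expressions are illustrative rather than part of the patent.

```python
import re

def sort_key(ocr_text: str):
    """Build a sort key from the OCR text of one target picture.

    Prefers an explicit heading number such as "2" or "2.1.3" at the start
    of a line; falls back to a bare trailing page number; unknown pages last.
    """
    heading = re.search(r"^\s*(\d+(?:\.\d+)*)\b", ocr_text, re.MULTILINE)
    if heading:
        return (0, tuple(int(x) for x in heading.group(1).split(".")))
    page = re.search(r"\b(\d{1,3})\s*$", ocr_text.strip())
    if page:
        return (1, (int(page.group(1)),))
    return (2, ())

def sort_target_pictures(pictures_with_text):
    """pictures_with_text: iterable of (picture_path, ocr_text) pairs."""
    return [p for p, _ in sorted(pictures_with_text,
                                 key=lambda pt: sort_key(pt[1]))]
```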
- in step S13, either of the following two implementations may be used to sort all the target pictures in the target picture set and synthesize the target file according to the sorting result.
- Implementation 1: all the target pictures currently in the target picture set are sorted in real time and a target file is synthesized; after the target picture set is updated, the target file is updated with the new target pictures in the target picture set to obtain an updated target file, until the target picture set is no longer updated within a set time.
- with this implementation, the target file presented to the user is always the latest file, organized in order.
- in addition, if a new target picture appears in the target picture set whose integrity is at least greater than that of the current target picture with the same picture content in the current target file, the current target picture is replaced with the new target picture, so as to update the current target file. That is, after the pictures on the server side are updated, if an updated picture has greater integrity than the current target picture with the same picture content in the current document and its sharpness satisfies the condition, the picture is swapped in to update the target file.
- Implementation 2: after an end page appears in the target picture set, or after the target picture set is no longer updated within a set time, all the target pictures in the target picture set are sorted and the target file is synthesized.
- that is, after all the pictures matching the target file have been updated, the non-duplicate, best-quality pictures in the target picture set are looked up, sorted in the preset order, and then synthesized.
- Whether there is an end identification character in each target picture can be identified according to a character recognition model, so as to determine whether the target picture is an end page.
- the end identification characters are, for example: thank you, contact information, thank you for listening, and other characters indicating the end.
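- End-page detection then reduces to looking for those markers in the recognized text. In the sketch below, pytesseract stands in for the unspecified character recognition model, the keyword list (including the Chinese phrases used in the original text) is illustrative, and the chi_sim+eng setting assumes the corresponding Tesseract language packs are installed.

```python
import cv2
import pytesseract

# Illustrative markers; the text mentions phrases such as
# "谢谢" (thank you), "联系方式" (contact information), "感谢聆听" (thanks for listening).
END_MARKERS = ("thank you", "thanks for listening", "contact information",
               "谢谢", "联系方式", "感谢聆听")

def is_end_page(image_path: str) -> bool:
    """True if the recognized text contains an end-identification phrase."""
    img = cv2.imread(image_path)
    if img is None:
        return False
    text = pytesseract.image_to_string(img, lang="chi_sim+eng").lower()
    return any(marker in text for marker in END_MARKERS)
```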
- besides steps S11 to S13, in this example the method for generating a file from shared pictures may further include: outputting the synthesized target file using a preset output template, and/or correcting the synthesized target file.
- the corrections include: rotation, scaling, translation and flipping of pictures; removal of shadows, backgrounds, annotations, and the like; and correction of tilt.
- preferably, the preset output template is an editable template including at least one template processing area; each template processing area can be filled with text, filled with pictures, annotated, and so on, so that the acquired character information, pictures and the like can be filled into the corresponding template page according to their positions.
- the preset output template may also be an uneditable template, which is not limited in this application.
- in step S13, preferably, after all the target pictures are sorted, the sorting result is first displayed to the client, and the target file is synthesized according to the sorting result only after the client confirms it.
- in this embodiment, the output form of the target file may be PDF, Word, TXT, PPT, or another format, and preferably matches the form in which the target file was originally presented: if a PPT presentation was photographed, the final synthesized target file is output as a PPT; if a Word document was photographed, the final synthesized target file is output as a Word document.
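- For the common case of PDF output, the sorted target pictures can be written into one document with Pillow, as sketched below; producing a PPT or Word file would instead need a library such as python-pptx or python-docx, which is mentioned only as a pointer. The file names are placeholders.

```python
from PIL import Image

def pictures_to_pdf(sorted_picture_paths, out_path="target_file.pdf"):
    """Write the sorted target pictures into a single multi-page PDF."""
    pages = [Image.open(p).convert("RGB") for p in sorted_picture_paths]
    if not pages:
        raise ValueError("no target pictures to synthesize")
    first, rest = pages[0], pages[1:]
    first.save(out_path, format="PDF", save_all=True, append_images=rest)
    return out_path
```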
- the method for generating a file by using a shared picture in the embodiment of the present invention can be applied to the server side in the embodiment of the present invention.
- the server side provided in this embodiment includes a processor and a memory, where a computer program is stored in the memory, and when the computer program is executed by the processor, the method for generating a file from shared pictures described in this embodiment is implemented.
- the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
- the general-purpose processor can be a microprocessor or the processor can also be any conventional processor, etc.
- the processor is the control center of the electronic device, and uses various interfaces and lines to connect various parts of the entire electronic device.
- the memory can be used to store the computer program, and the processor implements various functions of the server by running or executing the computer program stored in the memory and calling the data stored in the memory.
- Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
- Volatile memory may include random access memory (RAM) or external cache memory.
- RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
- the server side may also include a user interface, a network interface, and a communication bus.
- the user interface is used to receive information input by the user, such as a touch screen, a camera, and the like.
- the network interface is used for communication between the server and the outside world.
- the network interface mainly includes wired interface and wireless interface, such as RS232 module, radio frequency module, WIFI module and so on.
- the communication bus is used for communication between the components in the server side, and the communication bus can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc.
- the communication bus can be divided into an address bus, a data bus, a control bus, and the like.
- the server side has a sharing entry, and the sharing entry is used to share the synthesized target file to a public platform or other terminals. That is, after the server side generates the synthesized file, the user can share it through the sharing entry to a designated sharing platform for other users to browse; the user can also share the synthesized file to a designated terminal through the sharing entry.
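- The sharing entry can be thought of as one more HTTP endpoint on the server side. The Flask sketch below is purely hypothetical: the route names, the in-memory registry and the returned share URL are not part of the patent, and a real deployment would add authentication and persistent storage.

```python
import uuid
from flask import Flask, abort, jsonify, send_file

app = Flask(__name__)
shared_files = {}  # share_id -> path of a synthesized target file (in-memory only)

@app.post("/share/<path:file_path>")
def create_share(file_path):
    """Register a synthesized target file and hand back a share link."""
    share_id = uuid.uuid4().hex
    shared_files[share_id] = file_path
    return jsonify({"share_url": f"/shared/{share_id}"})

@app.get("/shared/<share_id>")
def fetch_shared(share_id):
    """Let a public platform or another terminal download the shared file."""
    path = shared_files.get(share_id)
    if path is None:
        abort(404)
    return send_file(path)

if __name__ == "__main__":
    app.run(port=8000)
```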
- This embodiment also provides a readable storage medium, where a computer program is stored in the readable storage medium, and when the computer program is executed by a processor, the method for generating a file by using a shared picture provided in this embodiment is implemented.
- the readable storage medium can be a tangible device that can hold and store instructions for use by the instruction execution device, such as, but not limited to, electrical storage devices, magnetic storage devices, optical storage devices, electromagnetic storage devices, semiconductor storage devices, or the above. any suitable combination. More specific examples (non-exhaustive list) of readable storage media include: portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or Flash memory), static random access memory (SRAM), portable compact disc read only memory (CD-ROM), digital versatile disc (DVD), memory sticks, floppy disks, mechanical encoding devices, and any suitable combination of the foregoing.
- to sum up, in the method, server side and readable storage medium for generating a file from shared pictures provided by the present invention, captured pictures are received from the clients, and all the captured pictures are classified according to the received picture feature information of the captured pictures, so as to store the captured pictures belonging to the same file in the same candidate set; the picture content of a plurality of the captured pictures in the same candidate set is identified and/or compared, so as to select target pictures from the candidate set and form a target picture set; and all the target pictures in the target picture set are sorted according to a preset sorting rule, and a target file is synthesized according to the sorting result. That is, the present invention synthesizes a target file from pictures taken and uploaded by multiple users, thereby solving the problems that, when a file is recorded by taking pictures, the captured pictures lack coherence and the content of the file is not presented completely.
Abstract
A method for generating a file using shared pictures, a server side, and a readable storage medium. The method includes: receiving captured pictures from clients, and classifying all the captured pictures according to the received picture feature information of the captured pictures, so as to store the captured pictures belonging to the same file in the same candidate set (S11); identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set, so as to select target pictures from the candidate set and form a target picture set (S12); and sorting all the target pictures in the target picture set according to a preset sorting rule, and synthesizing a target file according to the sorting result (S13). A target file is synthesized from pictures taken and uploaded by multiple users, which solves the problem that, when a file is recorded by taking pictures, the captured pictures lack coherence and the content of the file is not presented completely.
Description
The present invention relates to the technical field of artificial intelligence, and in particular to a method for generating a file using shared pictures, a server side, and a readable storage medium.
At present, PPT presentations are used in all kinds of meetings. To record the important content of a meeting, participants often take photos of the useful PPT slides; however, because of long distance, occlusion and the like, many participants end up with blurry photos. In addition, because the photos taken lack coherence, much of the content is not presented completely.
Summary of the Invention
The purpose of the present invention is to provide a method for generating a file using shared pictures, a server side, and a readable storage medium, so as to solve the problem that, when a file is recorded by taking pictures, the captured pictures lack coherence and the content of the file is not presented completely.
To solve the above technical problem, the present invention provides a method for generating a file using shared pictures, including:
receiving captured pictures from clients, and classifying all the captured pictures according to the received picture feature information of the captured pictures, so as to store the captured pictures belonging to the same file in the same candidate set;
identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set, so as to select target pictures from the candidate set and form a target picture set; and,
sorting all the target pictures in the target picture set according to a preset sorting rule, and synthesizing a target file according to the sorting result.
Optionally, in the method for generating a file using shared pictures, the method of identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set, so as to select target pictures from the candidate set and form a target picture set, includes:
identifying the picture content of a plurality of the captured pictures in the same candidate set, so as to select the captured pictures whose edge integrity and sharpness meet the conditions;
comparing the picture content of the captured pictures whose edge integrity and sharpness meet the conditions, so as to place captured pictures with different picture content into different sub-candidate sets of the candidate set; and,
using a preset selection rule, selecting one captured picture from each sub-candidate set as a target picture to be included in the target picture set.
Optionally, in the method for generating a file using shared pictures, the method of identifying the picture content of a plurality of the captured pictures in the same candidate set, so as to select the captured pictures whose edge integrity and sharpness meet the conditions, includes:
using an edge recognition model to identify the edge integrity of each captured picture, so as to select the captured pictures whose edge integrity meets the conditions; and
using a blurriness recognition model to identify the sharpness of the captured pictures whose edge integrity meets the conditions, so as to select the captured pictures whose edge integrity and sharpness meet the conditions.
Optionally, in the method for generating a file using shared pictures, the method of identifying and/or comparing the picture content of a plurality of the captured pictures in a candidate set, so as to select target pictures from the candidate set and correspondingly form a target picture set, further includes:
for all the captured pictures of the same picture content, if neither the integrity nor the sharpness meets the conditions, sending request information for those captured pictures to the clients, so as to obtain new captured pictures with the same picture content; and
if the requested captured pictures are not updated within a set time, then, among all the captured pictures whose sharpness differs from the qualifying sharpness by less than a set value, including the captured picture with the highest integrity in the target picture set.
Optionally, in the method for generating a file using shared pictures, the method of comparing the picture content of the captured pictures whose edge integrity and sharpness meet the conditions, so as to place captured pictures with different picture content into different sub-candidate sets of the candidate set, includes:
using a character recognition model to identify the repetition rate of the picture content of a plurality of the captured pictures and, when the repetition rate exceeds a preset repetition-rate threshold, determining that the picture content is the same and including the pictures in the same sub-candidate set;
using a character recognition model to identify whether the page numbers of a plurality of the captured pictures are the same and, if so, determining that the picture content is the same and including the pictures in the same sub-candidate set; and/or,
using a picture feature extraction model to extract the similarity of the picture feature values of a plurality of the captured pictures and, when the similarity of the picture feature values reaches a preset similarity threshold, determining that the picture content is the same and including the pictures in the same sub-candidate set.
Optionally, in the method for generating a file using shared pictures, the preset selection rule includes:
ranking the captured pictures in the same sub-candidate set according to the integrity and/or sharpness of the picture content; and,
taking the highest-ranked captured picture as the target picture.
Optionally, in the method for generating a file using shared pictures, the preset sorting rule includes: sorting according to the correlation of shooting time, page numbers and/or titles between the pictures.
Optionally, in the method for generating a file using shared pictures, the method of sorting all the target pictures in the target picture set and synthesizing a target file according to the sorting result includes:
sorting all the target pictures currently in the target picture set in real time, and synthesizing a target file; and
after the target picture set is updated, updating the target file with the new target pictures in the target picture set to obtain an updated target file, until the target picture set is no longer updated within a set time.
Optionally, in the method for generating a file using shared pictures, the method of sorting all the target pictures in the target picture set and synthesizing a target file according to the sorting result further includes:
if a new target picture appears in the target picture set whose integrity is at least greater than that of the current target picture with the same picture content in the current target file, replacing the current target picture with the new target picture, so as to update the current target file.
Optionally, in the method for generating a file using shared pictures, the method of sorting all the target pictures in the target picture set and synthesizing a target file according to the sorting result includes:
after an end page appears in the target picture set, or after the target picture set is no longer updated within a set time, sorting all the target pictures in the target picture set and synthesizing the target file.
Optionally, in the method for generating a file using shared pictures, whether an end identification character exists in each target picture is identified according to a character recognition model, so as to determine whether the target picture is an end page.
Optionally, in the method for generating a file using shared pictures, the picture feature information of the captured pictures includes one or more of picture location information, user input information and picture content information.
Optionally, the method for generating a file using shared pictures further includes:
outputting the synthesized target file using a preset output template, and/or correcting the synthesized target file.
Optionally, in the method for generating a file using shared pictures, after all the target pictures are sorted, the method further includes:
displaying the sorting result to the clients and, after the clients confirm, synthesizing the target file according to the sorting result.
The present invention also provides a server side, including a processor and a memory, where a computer program is stored in the memory, and when the computer program is executed by the processor, the above method for generating a file using shared pictures is implemented.
Optionally, in the server side, the server side has a sharing entry, and the sharing entry is used to share the synthesized file to a public platform or other terminals.
The present invention also provides a readable storage medium storing a computer program which, when executed by a processor, implements the above method for generating a file using shared pictures.
To sum up, in the method, server side and readable storage medium for generating a file using shared pictures provided by the present invention, captured pictures from clients are received, and all the captured pictures are classified according to the received picture feature information of the captured pictures, so as to store the captured pictures belonging to the same file in the same candidate set; the picture content of a plurality of the captured pictures in the same candidate set is identified and/or compared, so as to select target pictures from the candidate set and form a target picture set; and all the target pictures in the target picture set are sorted according to a preset sorting rule, and a target file is synthesized according to the sorting result. That is, the present invention synthesizes a target file from pictures taken and uploaded by multiple users, thereby solving the problems that, when a file is recorded by taking pictures, the captured pictures lack coherence and the content of the file is not presented completely.
FIG. 1 is a flowchart of a method for generating a file using shared pictures according to an embodiment of the present invention;
FIG. 2 is a flowchart of forming a target picture set in an embodiment of the present invention;
FIG. 3 is an example diagram of sorting according to the order of titles in an embodiment of the present invention.
To make the purpose, advantages and features of the present invention clearer, the present invention is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that the drawings are all in a greatly simplified form and not drawn to scale, and are only intended to assist in explaining the embodiments of the present invention conveniently and clearly. Moreover, the structures shown in the drawings are often only part of the actual structures; in particular, the drawings emphasize different aspects and sometimes use different scales.
To solve the problem that, when a file is recorded by taking pictures, the captured pictures lack coherence and the content of the file is not presented completely, the embodiments of the present invention provide a method for generating a file using shared pictures, a server side, and a readable storage medium.
FIG. 1 is a flowchart of a method for generating a file using shared pictures according to an embodiment of the present invention. As shown in FIG. 1, an embodiment of the present invention provides a method for generating a file using shared pictures, including the following steps:
S11: receiving captured pictures from clients, and classifying all the captured pictures according to the received picture feature information of the captured pictures, so as to store the captured pictures belonging to the same file in the same candidate set;
S12: identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set, so as to select target pictures from the candidate set and form a target picture set;
S13: sorting all the target pictures in the target picture set according to a preset sorting rule, and synthesizing a target file according to the sorting result.
The method for generating a file using shared pictures of the embodiment of the present invention can be applied to the server side of the embodiment of the present invention. The server side is a public server side and can be, for example, a personal computer or a mobile terminal; the mobile terminal can be a hardware device with any of various operating systems, such as a mobile phone or a tablet computer. In addition, it should be noted that the synthesized target files herein include Word documents, PDF documents, Excel documents, PPT documents, TXT documents, and so on.
After multiple photographers upload pictures of the captured file to the server side, the server side can sequentially classify, identify and/or compare, and sort all the obtained pictures, and finally synthesize the target file, thereby solving the problem that, when a file is recorded by taking pictures, the captured pictures lack coherence and the content of the file is not presented completely.
The method for generating a file using shared pictures provided by the embodiment of the present invention is described in further detail below.
In step S11, the picture feature information of the captured pictures includes one or more of picture location information, user input information and picture content information. Pictures with the same location information are preliminarily regarded as pictures of the same file; of course, different files may also be shot and uploaded at the same location, for example in different rooms of the same building, in which case other picture feature information needs to be combined to make the judgment. The user input information is, for example, the meeting location, meeting name, meeting content and the like entered by the user when uploading a picture; if different users enter the same information, they are considered to be shooting the same file, and their pictures need to be put into the same candidate set. The picture content information is, for example, the title of the file; the features of the pictures uploaded by users can also be identified with a pre-trained recognition model, and if the features of the pictures taken by different users are similar, they are considered to be shooting the same file. At the same time, the pictures taken and uploaded by the same user within a set time are regarded as pictures of the same file (after the set time, the user may move on to shoot other files) and are put into the same candidate set, so that all pictures of the same file taken by different users are placed in the same candidate set.
In step S12, by identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set, the captured pictures are screened, so that the target pictures in the target picture set are all clear and complete target pictures.
Specifically, as shown in FIG. 2, the method of identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set, so as to select target pictures from the candidate set and form a target picture set, may include the following steps:
S121: identifying the picture content of a plurality of the captured pictures in the same candidate set, so as to select the captured pictures whose edge integrity and sharpness meet the conditions;
S122: comparing the picture content of the captured pictures whose edge integrity and sharpness meet the conditions, so as to place captured pictures with different picture content into different sub-candidate sets of the candidate set; and,
S123: using a preset selection rule, selecting one captured picture from each sub-candidate set as a target picture to be included in the target picture set.
Optionally, in step S121, the method of identifying the picture content of a plurality of the captured pictures in the same candidate set, so as to select the captured pictures whose edge integrity and sharpness meet the conditions, includes:
using an edge recognition model to identify the edge integrity of each captured picture, so as to select the captured pictures whose edge integrity meets the conditions; and
using a blurriness recognition model to identify the sharpness of the captured pictures whose edge integrity meets the conditions, so as to select the captured pictures whose edge integrity and sharpness meet the conditions.
An edge integrity threshold and a blurriness threshold can be defined to determine, respectively, whether the edge integrity and the sharpness of a captured picture meet the conditions. Specifically, an edge recognition model is used to identify the edge integrity of each captured picture; if the edge integrity of a captured picture is greater than the edge integrity threshold, its edge integrity is determined to meet the condition. A blurriness recognition model is used to identify the blur value of each captured picture; if the blur value of a captured picture is less than the blurriness threshold, its sharpness is determined to meet the condition. Preferably, the edge integrity threshold is set to a relatively low value, to avoid the situation where, for example due to occlusion, all the captured pictures are incomplete and none is selected, which would result in missing pages in the resulting target file. Similarly, when setting the blurriness threshold, a relatively high value can be chosen, to avoid the situation where, for example because of the shooting conditions (such as low light), none of the captured pictures is very clear and none is selected, which would likewise lead to missing pages in the resulting target file. In this embodiment, when screening the captured pictures, picture integrity is prioritized over picture sharpness, so as to ensure that at least no file content is missing while also speeding up the screening. In other embodiments, an edge recognition model may be used to identify all the captured pictures whose edge integrity meets the condition while a blurriness recognition model identifies all the captured pictures whose sharpness meets the condition; then, among all the captured pictures of the same picture content, the captured pictures whose integrity and sharpness both meet the conditions are selected.
The edge recognition model can be obtained through pre-training. When the edge recognition model is used to identify the edges of the PPT region in a picture, if a part is missing, the missing part can be completed according to the already identified edge lines; the edge integrity is then judged by the ratio of the identified area to the complete area, or by the ratio of the length of the identified edge lines to the length of the complete edge outline.
The blurriness recognition model can, for example, use OpenCV and the Laplacian operator to compute the picture variance, i.e. the Variance of the Laplacian algorithm: a channel of the picture (generally the grayscale values) is convolved with a Laplacian kernel, and the variance (the square of the standard deviation) of the result is computed; the larger the picture variance, the sharper the picture. If the picture variance is less than a predefined threshold, the picture can be considered not to meet the sharpness condition; if it is above the predefined threshold, the picture can be considered to meet it. In practice, a sharpness threshold can also be used, i.e. the computed picture variance is converted into a picture sharpness score; when the picture sharpness is below the predefined sharpness threshold, the picture is considered not to meet the condition, and when it is above the threshold, the picture is considered to meet it. The sharpness threshold can be adjusted manually according to the actual situation, provided the human eye can still recognize the picture content. In other embodiments, the blurriness recognition model can also use the grayscale variance algorithm, the sum of squared grayscale differences, the Brenner function, or other methods well known to those skilled in the art to determine whether the sharpness of a captured picture meets the condition; these are not repeated here.
In some other embodiments, after step S121, preferably, the method of identifying and/or comparing the picture content of the obtained captured pictures and selecting target pictures to form a target picture set further includes:
for all the captured pictures of the same picture content, if neither the integrity nor the sharpness meets the conditions, sending request information for those captured pictures to the clients, so as to obtain new captured pictures with the same picture content; and
if the requested captured pictures are not updated within a set time, then, among all the captured pictures whose sharpness differs from the qualifying sharpness by less than a set value, including the captured picture with the highest integrity in the target picture set.
For example, set the integrity threshold to threshold A and the picture sharpness threshold to threshold B, and set a minimum sharpness condition: the difference between the picture sharpness and threshold B is less than threshold C. If all the captured pictures whose picture content is D have integrity below threshold A and sharpness below threshold B, request information for pictures with content D is sent to the clients again; if the clients do not send a picture with content D within the set time, then, among all the captured pictures whose sharpness differs from threshold B by less than threshold C, the captured picture with the highest integrity is included in the target picture set.
In this embodiment, re-requesting pictures that do not meet the conditions avoids missing pages in the target file generated later. Hence, in step S121, when setting the edge integrity threshold and the blurriness threshold, relatively high values can be set compared with the previous embodiment, so that the preliminarily screened pictures are all of relatively high quality.
In practice, the edge integrity threshold can also be adjusted manually according to the actual situation. For example, if the edge area of the file is blank, or the text in the edge area does not affect the reading of the file, the edge integrity threshold can be lowered, for example to 80% or 85%; that is, when the edge integrity of a captured picture exceeds 80%, 85%, and so on, it can be included in the candidate set.
As can be seen from the above description, when pictures are obtained from the clients, multiple pictures of the same content may be obtained. Therefore, in step S12 of this embodiment, besides identifying the picture content in step S121 so that the target pictures in the target picture set are all clear and complete target pictures, step S122 is also used to compare the picture content of the obtained captured pictures, so as to ensure that each target picture in the target picture set is unique.
Specifically, in step S122, the methods of comparing the picture content of the captured pictures whose edge integrity and sharpness meet the conditions, so as to place captured pictures with different picture content into different sub-candidate sets of the candidate set, include:
using a character recognition model to identify the repetition rate of the picture content of a plurality of the captured pictures and, when the repetition rate exceeds a preset repetition-rate threshold, determining that the picture content is the same and including the pictures in the same sub-candidate set;
using a character recognition model to identify whether the page numbers of a plurality of the captured pictures are the same and, if so, determining that the picture content is the same and including the pictures in the same sub-candidate set; and/or,
using a picture feature extraction model to extract the similarity of the picture feature values of a plurality of the captured pictures and, when the similarity of the picture feature values reaches a preset similarity threshold, determining that the picture content is the same and including the pictures in the same sub-candidate set.
It should be understood that, besides those listed above, other methods that can be used to determine picture similarity should also fall within the protection scope of this application.
After step S122, there may be multiple pictures in the same sub-candidate set. Therefore, in step S123, pictures with the same content are screened, to ensure that, of the multiple captured pictures of the same picture content, only one is selected into the target picture set.
In step S123, the preset selection rule may include: ranking the captured pictures in the same sub-candidate set according to the integrity and/or sharpness of the picture content; and taking the highest-ranked captured picture as the target picture. In practice, different selection modes can be set, for example mode one, mode two and mode three: mode one indicates that the user prefers the best integrity, mode two indicates that the user prefers the best sharpness, and mode three indicates that the user prefers the best combined effect of integrity and sharpness. In step S11, when entering the target-file request information, the user can also choose the selection mode. If the user chooses mode one, then in this step, when ranking the captured pictures in the same sub-candidate set, they are ranked by the integrity of the picture content: the greater the integrity, the higher the rank. If the user chooses mode two, the captured pictures in the same sub-candidate set are ranked by the sharpness of the picture content: the greater the sharpness, the higher the rank.
In step S13, the preset sorting rule includes sorting according to the correlation of shooting time, page numbers and/or titles between the pictures, specifically as follows:
(1) The preset sorting rule may be sorting according to the shooting time of the pictures.
The specific process includes:
obtaining the creation time of the users' pictures with a time acquisition model; and
arranging the pictures in chronological order of their creation time with a picture arrangement model.
(2) The preset sorting rule may also be sorting according to the page numbers in the pictures.
A character recognition model identifies the page numbers in the shared picture set, and the pictures are arranged in ascending page-number order.
(3) The preset sorting rule may also be sorting according to the relevance of the content in the pictures.
In a specific implementation, content relevance refers to the next piece of content that is adjacent to, and located after, the content of the current picture, for example the order of headings and subheadings, the degree of connection between preceding and following content in a picture identified with a text recognition model, and any other sorting rule that keeps the target pictures coherent.
Referring to FIG. 3, when sorting by the order of titles, the first-level headings are confirmed first, i.e. 1, 2 and 3 as shown in FIG. 3, and then the subheadings under each first-level heading are confirmed and matched one-to-one with their first-level heading; for example, the subheadings of heading 1 are 1.1, 1.2, 1.3..., the subheadings of heading 2 are 2.1, 2.2, 2.3..., and the subheadings of heading 3 are 3.1, 3.2, 3.3...; the pictures are then sorted to form the target file shown in FIG. 2.
In other embodiments, the preset sorting rule may also be a combination of the above sorting rules. For example, to improve recognition accuracy, besides using the titles, a text recognition model can be used to recognize the coherence of the content of adjacent pages in the text (the degree of connection of the contextual content).
In step S13, either of the following two implementations may be used to sort all the target pictures in the target picture set and synthesize the target file according to the sorting result.
Implementation 1:
All the target pictures currently in the target picture set are sorted in real time and a target file is synthesized; after the target picture set is updated, the target file is updated with the new target pictures in the target picture set to obtain an updated target file, until the target picture set is no longer updated within a set time.
With implementation 1, the target file presented to the user is always the latest file, organized in order. In addition, if a new target picture appears in the target picture set whose integrity is at least greater than that of the current target picture with the same picture content in the current target file, the current target picture is replaced with the new target picture, so as to update the current target file. That is, after the pictures on the server side are updated, if an updated picture has greater integrity than the current target picture with the same picture content in the current document and its sharpness meets the condition, the picture is swapped in to update the target file.
Implementation 2:
After an end page appears in the target picture set, or after the target picture set is no longer updated within a set time, all the target pictures in the target picture set are sorted and the target file is synthesized.
That is, after all the pictures matching the target file have been updated, the non-duplicate, best-quality pictures in the target picture set are looked up, sorted in the preset order, and then synthesized.
Whether an end identification character exists in each target picture can be identified with a character recognition model, so as to determine whether the target picture is an end page. The end identification characters are, for example, characters indicating the end, such as "thank you", "contact information", or "thanks for listening".
Besides the above steps S11 to S13, in this example the method for generating a file using shared pictures may further include: outputting the synthesized target file using a preset output template, and/or correcting the synthesized target file.
The corrections include: rotation, scaling, translation and flipping of pictures, removal of shadows, backgrounds, annotations and the like, and correction of tilt, and so on. Preferably, the preset output template is an editable template including at least one template processing area; each template processing area can be filled with text, filled with pictures, annotated, and so on, so that the acquired character information, pictures and the like can be filled into the corresponding template page according to their positions. In other embodiments, the preset output template may also be a non-editable template, which is not limited in this application.
In step S13, preferably, after all the target pictures are sorted, the sorting result is first displayed to the clients, and the target file is synthesized according to the sorting result only after the clients confirm it.
In this embodiment, the output form of the target file may be PDF, Word, TXT, PPT, or another output form, and preferably matches the form in which the target file was originally presented; for example, if a PPT presentation is photographed, the final synthesized target file is output as a PPT, and if a Word document is photographed, the final synthesized target file is output as a Word document.
As mentioned above, the method for generating a file using shared pictures of the embodiment of the present invention can be applied to the server side of the embodiment of the present invention. Specifically, the server side provided in this embodiment includes a processor and a memory, where a computer program is stored in the memory, and when the computer program is executed by the processor, the method for generating a file using shared pictures described in this embodiment is implemented.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the electronic device and connects all parts of the entire electronic device through various interfaces and lines.
The memory can be used to store the computer program, and the processor implements the various functions of the server side by running or executing the computer program stored in the memory and calling the data stored in the memory.
The memory may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Besides the processor and the memory, the server side may also include a user interface, a network interface and a communication bus. The user interface is used to receive information entered by the user, for example via a touch screen or a camera. The network interface is used for communication between the server side and the outside world; it mainly includes wired and wireless interfaces, such as an RS232 module, a radio-frequency module, a WiFi module, and so on. The communication bus is used for communication between the components of the server side, and may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, among others. The communication bus may be divided into an address bus, a data bus, a control bus, and so on.
Optionally, the server side has a sharing entry, and the sharing entry is used to share the synthesized target file to a public platform or other terminals. That is, after the server side generates the synthesized file, the user can share the generated file through the sharing entry to a designated sharing platform for other users to browse; the user can also share the synthesized file to a designated terminal through the sharing entry.
This embodiment also provides a readable storage medium storing a computer program which, when executed by a processor, implements the method for generating a file using shared pictures provided in this embodiment.
The readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device, such as, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of readable storage media include: a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device, and any suitable combination of the foregoing.
To sum up, the method, server side and readable storage medium for generating a file using shared pictures provided by the present invention receive captured pictures from clients, classify all the captured pictures according to the received picture feature information of the captured pictures so as to store the captured pictures belonging to the same file in the same candidate set, identify and/or compare the picture content of a plurality of the captured pictures in the same candidate set so as to select target pictures from the candidate set and form a target picture set, and sort all the target pictures in the target picture set according to a preset sorting rule and synthesize a target file according to the sorting result. That is, the present invention synthesizes a target file from pictures taken and uploaded by multiple users, thereby solving the problems that, when a file is recorded by taking pictures, the captured pictures lack coherence and the content of the file is not presented completely.
It should also be appreciated that, although the present invention has been disclosed above through preferred embodiments, the above embodiments are not intended to limit the invention. Any person skilled in the art can, without departing from the scope of the technical solution of the present invention, use the technical content disclosed above to make many possible changes and modifications to the technical solution of the present invention, or to modify it into equivalent embodiments. Therefore, any simple modification, equivalent change or modification made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the protection scope of the technical solution of the present invention.
Claims (17)
- A method for generating a file using shared pictures, characterized by comprising: receiving captured pictures from clients, and classifying all the captured pictures according to the received picture feature information of the captured pictures, so as to store the captured pictures belonging to the same file in the same candidate set; identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set, so as to select target pictures from the candidate set and form a target picture set; and sorting all the target pictures in the target picture set according to a preset sorting rule, and synthesizing a target file according to the sorting result.
- The method for generating a file using shared pictures according to claim 1, characterized in that the method of identifying and/or comparing the picture content of a plurality of the captured pictures in the same candidate set, so as to select target pictures from the candidate set and form a target picture set, comprises: identifying the picture content of a plurality of the captured pictures in the same candidate set, so as to select the captured pictures whose edge integrity and sharpness meet the conditions; comparing the picture content of the captured pictures whose edge integrity and sharpness meet the conditions, so as to place captured pictures with different picture content into different sub-candidate sets of the candidate set; and, using a preset selection rule, selecting one captured picture from each sub-candidate set as a target picture to be included in the target picture set.
- The method for generating a file using shared pictures according to claim 2, characterized in that the method of identifying the picture content of a plurality of the captured pictures in the same candidate set, so as to select the captured pictures whose edge integrity and sharpness meet the conditions, comprises: using an edge recognition model to identify the edge integrity of each captured picture, so as to select the captured pictures whose edge integrity meets the conditions; and using a blurriness recognition model to identify the sharpness of the captured pictures whose edge integrity meets the conditions, so as to select the captured pictures whose edge integrity and sharpness meet the conditions.
- The method for generating a file using shared pictures according to claim 3, characterized in that the method of identifying and/or comparing the picture content of a plurality of the captured pictures in a candidate set, so as to select target pictures from the candidate set and correspondingly form a target picture set, further comprises: for all the captured pictures of the same picture content, if neither the integrity nor the sharpness meets the conditions, sending request information for those captured pictures to the clients, so as to obtain new captured pictures with the same picture content; and, if the requested captured pictures are not updated within a set time, then, among all the captured pictures whose sharpness differs from the qualifying sharpness by less than a set value, including the captured picture with the highest integrity in the target picture set.
- The method for generating a file using shared pictures according to claim 2, characterized in that the method of comparing the picture content of the captured pictures whose edge integrity and sharpness meet the conditions, so as to place captured pictures with different picture content into different sub-candidate sets of the candidate set, comprises: using a character recognition model to identify the repetition rate of the picture content of a plurality of the captured pictures and, when the repetition rate exceeds a preset repetition-rate threshold, determining that the picture content is the same and including the pictures in the same sub-candidate set; using a character recognition model to identify whether the page numbers of a plurality of the captured pictures are the same and, if so, determining that the picture content is the same and including the pictures in the same sub-candidate set; and/or, using a picture feature extraction model to extract the similarity of the picture feature values of a plurality of the captured pictures and, when the similarity of the picture feature values reaches a preset similarity threshold, determining that the picture content is the same and including the pictures in the same sub-candidate set.
- The method for generating a file using shared pictures according to claim 2, characterized in that the preset selection rule comprises: ranking the captured pictures in the same sub-candidate set according to the integrity and/or sharpness of the picture content; and taking the highest-ranked captured picture as the target picture.
- The method for generating a file using shared pictures according to claim 1, characterized in that the preset sorting rule comprises: sorting according to the correlation of shooting time, page numbers and/or titles between the pictures.
- The method for generating a file using shared pictures according to claim 1, characterized in that the method of sorting all the target pictures in the target picture set and synthesizing a target file according to the sorting result comprises: sorting all the target pictures currently in the target picture set in real time and synthesizing a target file; and, after the target picture set is updated, updating the target file with the new target pictures in the target picture set to obtain an updated target file, until the target picture set is no longer updated within a set time.
- The method for generating a file using shared pictures according to claim 8, characterized in that the method of sorting all the target pictures in the target picture set and synthesizing a target file according to the sorting result further comprises: if a new target picture appears in the target picture set whose integrity is at least greater than that of the current target picture with the same picture content in the current target file, replacing the current target picture with the new target picture, so as to update the current target file.
- The method for generating a file using shared pictures according to claim 1, characterized in that the method of sorting all the target pictures in the target picture set and synthesizing a target file according to the sorting result comprises: after an end page appears in the target picture set, or after the target picture set is no longer updated within a set time, sorting all the target pictures in the target picture set and synthesizing the target file.
- The method for generating a file using shared pictures according to claim 10, characterized in that whether an end identification character exists in each target picture is identified according to a character recognition model, so as to determine whether the target picture is an end page.
- The method for generating a file using shared pictures according to claim 1, characterized in that the picture feature information of the captured pictures comprises one or more of picture location information, user input information and picture content information.
- The method for generating a file using shared pictures according to claim 1, characterized by further comprising: outputting the synthesized target file using a preset output template, and/or correcting the synthesized target file.
- The method for generating a file using shared pictures according to claim 1, characterized in that, after all the target pictures are sorted, the method further comprises: displaying the sorting result to the clients and, after the clients confirm, synthesizing the target file according to the sorting result.
- A server side, characterized by comprising a processor and a memory, wherein a computer program is stored in the memory, and when the computer program is executed by the processor, the method for generating a file using shared pictures according to any one of claims 1 to 14 is implemented.
- The server side according to claim 15, characterized in that the server side has a sharing entry, and the sharing entry is used to share the synthesized target file to a public platform or other terminals.
- A readable storage medium, characterized in that a computer program is stored in the readable storage medium, and when the computer program is executed by a processor, the method for generating a file using shared pictures according to any one of claims 1 to 14 is implemented.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110070452.7A CN112784085A (zh) | 2021-01-19 | 2021-01-19 | Method for generating a file using shared pictures, server side, and readable storage medium |
CN202110070452.7 | 2021-01-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022156538A1 (zh) | 2022-07-28 |
Family
ID=75757694
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/070348 WO2022156538A1 (zh) | 2022-01-05 | Method for generating a file using shared pictures, server side, and readable storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112784085A (zh) |
WO (1) | WO2022156538A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117848369A (zh) * | 2023-12-28 | 2024-04-09 | 四川九正鼎盛城乡建设集团有限公司 | Interactive guided-tour method and system for tourist highways |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112784085A (zh) | 2021-01-19 | 2021-05-11 | 杭州睿胜软件有限公司 | Method for generating a file using shared pictures, server side, and readable storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104199933A (zh) * | 2014-09-04 | 2014-12-10 | 华中科技大学 | Soccer video event detection and semantic annotation method based on multimodal information fusion |
CN108573070A (zh) * | 2018-05-08 | 2018-09-25 | 深圳市万普拉斯科技有限公司 | Picture recognition and organization method and apparatus, and picture folder creation method |
WO2020112738A1 (en) * | 2018-11-26 | 2020-06-04 | Photo Butler Inc. | Presentation file generation |
CN112784085A (zh) * | 2021-01-19 | 2021-05-11 | 杭州睿胜软件有限公司 | Method for generating a file using shared pictures, server side, and readable storage medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109325219B (zh) * | 2018-08-24 | 2023-04-07 | 维沃移动通信有限公司 | Method, apparatus and system for generating a record document |
CN109492206A (zh) * | 2018-10-10 | 2019-03-19 | 深圳市容会科技有限公司 | PPT presentation document recording method and apparatus, computer device, and storage medium |
- 2021-01-19: CN application CN202110070452.7A, published as CN112784085A (zh), active, pending
- 2022-01-05: WO application PCT/CN2022/070348, published as WO2022156538A1 (zh), active, application filing
Also Published As
Publication number | Publication date |
---|---|
CN112784085A (zh) | 2021-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
- WO2022156538A1 (zh) | Method for generating a file using shared pictures, server side, and readable storage medium | |
- CN108710847B (zh) | Scene recognition method and apparatus, and electronic device | |
- CN111062871B (zh) | Image processing method and apparatus, computer device, and readable storage medium | |
- JP6267224B2 (ja) | Method and system for detecting and selecting the best photos | |
US20200175062A1 (en) | Image retrieval method and apparatus, and electronic device | |
US8917943B2 (en) | Determining image-based product from digital image collection | |
- CN108401112B (zh) | Image processing method and apparatus, terminal, and storage medium | |
US9299004B2 (en) | Image foreground detection | |
US10019823B2 (en) | Combined composition and change-based models for image cropping | |
EP2312462A1 (en) | Systems and methods for summarizing photos based on photo information and user preference | |
US10803348B2 (en) | Hybrid-based image clustering method and server for operating the same | |
Kumar et al. | A dataset for quality assessment of camera captured document images | |
US20130050747A1 (en) | Automated photo-product specification method | |
- WO2022111072A1 (zh) | Image capturing method and apparatus, electronic device, server, and storage medium | |
US20180025215A1 (en) | Anonymous live image search | |
- WO2022100690A1 (zh) | Animal face style image generation method, model training method, apparatus and device | |
EP3565243A1 (en) | Method and apparatus for generating shot information | |
US20140029854A1 (en) | Metadata supersets for matching images | |
- CN106874922B (zh) | Method and apparatus for determining a service parameter | |
- CN110929063A (zh) | Album generation method, terminal device, and computer-readable storage medium | |
- CN112036342B (zh) | Document capture method, device, and computer storage medium | |
US20240045992A1 (en) | Method and electronic device for removing sensitive information from image data | |
- CN111353063B (zh) | Picture display method and apparatus, and storage medium | |
- CN110765435B (zh) | Method and apparatus for determining identity attributes of a person, and electronic device | |
US20180189602A1 (en) | Method of and system for determining and selecting media representing event diversity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22742017, Country of ref document: EP, Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 22742017, Country of ref document: EP, Kind code of ref document: A1 |