WO2009081791A1 - Information processing system, method thereof, and program
- Publication number
- WO2009081791A1 (PCT/JP2008/072824)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- area
- text
- region
- objects
- document
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
Definitions
- The present invention relates to an information processing system, a method thereof, and a program. In particular, it relates to a document image layout analysis technique that, for a document in which diagrams and text are mixed, identifies and classifies text regions and non-text regions (chart regions) such as diagram regions and table regions, and performs region segmentation.
- An example of a related document image layout analysis system is described in Patent Document 1.
- This related document image layout analysis system is composed of basic line extraction means and line/stage mutual extraction means.
- The related document image layout analysis system having such a configuration operates as follows.
- A set of basic elements constituting a document is input, such as the black-pixel connected components in the document image or the circumscribed rectangles or overlapping rectangles of those connected components.
- The line/stage mutual extraction means generates lines based on the proximity of the basic elements (character components are arranged relatively densely) and their homogeneity (character components are approximately the same size). Next, it generates stages by integrating sets of lines based on their proximity and homogeneity.
- An example of another related document image layout analysis system is described in Patent Document 2.
- This related document image layout analysis system includes an area extraction unit, an image generation unit, a feature calculation unit, and a distance calculation unit.
- The related document image layout analysis system having such a configuration operates as follows.
- The area extraction unit analyzes the document image to extract the text area, the chart area, and the background area.
- The image generation unit generates an image of the document in which the extracted background area is filled with the background designation color, the text area with the text designation color, and the chart area with the chart designation color.
- The feature calculation unit calculates layout features indicating the proportions of the background area, the text area, and the chart area in the generated image, together with features concerning the hiragana and katakana that occupy the text area.
- The distance calculation unit computes, between the query document image and each search target document image, the distance (similarity) between their layout features, the distance between their text features, and the distance between their image features, and outputs the document images in ascending order of distance.
- Patent Document 1: Japanese Patent Laid-Open No. 11-219407 (pages 6-9, FIGS. 1 and 9)
- Patent Document 2: JP 2006-318219 A (pages 4-5, FIG. 1)
- The first problem is that a document written in various character sizes, or a document with a complex layout, cannot be handled.
- The reason is that the layout of a presentation document or the like is complex and diverse, and when text blocks, or text blocks and diagrams, are arranged in an intricate manner, the stages cannot be extracted and the text area is over-integrated or over-divided.
- The second problem is that a similar document search based on the arrangement of the text area and the image area cannot be performed.
- The reason is that the similar document search is performed by calculating a distance between feature amounts that indicate the proportions of the text area and the image area in a document image, not their arrangement.
- An object of the present invention is to provide an information processing system, a method, and a program that divide a document into text areas and chart areas in a way that matches how the document appears to a human.
- The present invention that solves the above-described problems is an information processing system having object classification means for classifying objects constituting a document, extracted from an electronic document or a document image, into objects constituting a text region and objects constituting a chart region, using at least an area histogram of the objects including text.
- The present invention that solves the above-described problems is an information processing method characterized by having an object classification process for classifying objects constituting a document, extracted from an electronic document or a document image, into objects constituting a text region and objects constituting a chart region, using at least an area histogram of the objects including text.
- The present invention that solves the above-described problems is a program that causes an information processing apparatus to execute an object classification process for classifying objects constituting a document, extracted from an electronic document or a document image, into objects constituting a text region and objects constituting a chart region, using at least an area histogram of the objects including text.
- According to the present invention, a document can be appropriately divided into a text area and a chart area.
- FIG. 1 is a block diagram showing the configuration of the first embodiment.
- FIG. 2 is a flowchart showing the operation of the first embodiment.
- FIG. 3 is a flowchart showing details of the operation (step A2 in FIG. 2) of the object classification means of the first embodiment.
- FIG. 4 is a diagram showing an example of object classification using an object area histogram.
- FIG. 5 is a diagram showing another example of object classification using an object area histogram.
- FIG. 6 is a flowchart showing details of the operations (step A3 in FIG. 2) of the text area generating means and the chart area generating means of the first embodiment.
- FIG. 7 is a diagram illustrating an example of an integration process of objects that overlap each other.
- FIG. 8 is a diagram for explaining the visual impression distance.
- FIG. 9 is a diagram illustrating the operation of the object integration process using the visual impression distance.
- FIG. 10 is a diagram illustrating a specific example of the object integration process using the visual impression distance.
- FIG. 11 is a diagram for explaining the visual impression distance.
- FIG. 12 is a diagram for explaining the visual impression distance.
- FIG. 13 is a diagram illustrating an example of area information.
- FIG. 14 is a block diagram showing the configuration of the second embodiment.
- FIG. 15 is a flowchart showing the operation of the second embodiment.
- FIG. 16 is a diagram showing an example of a query input screen regarding the layout of the area.
- FIG. 17 is a diagram illustrating a specific example of the integration process using the visual impression distance of the region input as a query.
- FIG. 18 is a diagram illustrating an example of a formula for calculating the region similarity.
- FIG. 19 is a schematic diagram showing the association between an area input as a query and an area of a divided document.
- FIG. 20 is a diagram showing an example of a formula for calculating the overall similarity using the average value of the region similarity.
- FIG. 21 is a diagram showing an example of a query input screen based on a combination of area layout and keywords.
- An information processing system 100 includes object extraction means 110, object classification means 120, text area generation means 130, chart area generation means 140, and area information generation means 150.
- the object extracting unit 110 analyzes an electronic document or a document image and extracts an object included in the document.
- the object refers to a character, a line, a text block composed of a plurality of characters or lines, a figure, a table, a graph, an image, and the like.
- Related techniques for extracting objects from document images include threshold processing, labeling processing, edge processing, and the like. In the present invention, these related techniques are also used to extract objects from document images.
- For an electronic document created with presentation creation software (for example, PowerPoint (registered trademark) of Microsoft (registered trademark)), the data file is analyzed to extract objects. In the present embodiment, the latter case is described below.
- the object classification unit 120 classifies the object extracted by the object extraction unit 110 into an object constituting a text area and an object constituting a chart area based on the area histogram of the object including the text.
- the text area generation unit 130 performs an integration process of objects classified as objects constituting the text area by the object classification unit 120 based on the visual impression distance, and generates a text area composed of a plurality of objects.
- the chart area generation unit 140 performs an integration process of the objects classified as objects constituting the chart area by the object classification unit 120 based on the visual impression distance, and generates a chart area composed of a plurality of objects.
- the region information generation unit 150 generates region information representing each region generated by the text region generation unit 130 and the chart region generation unit 140.
- the electronic document given from the input device (not shown) is supplied to the object extraction means 110.
- The object extraction unit 110 extracts objects included in the document, such as text blocks, diagrams, tables, graphs, and images, by using a function provided by the presentation creation software or by analyzing the electronic document data file. At the same time, a minimum bounding rectangle (MBR), composed of sides parallel to the x axis and the y axis, is generated for each extracted object (step A1 in FIG. 2).
- The object classification unit 120 classifies the objects extracted by the object extraction unit 110 into objects constituting a text area and objects constituting a chart area based on the area histogram of the objects including text (step A2).
- the object is classified into an object (text block) containing text and an object (figure, table, graph, image) not containing text (step A2-1).
- objects that do not include text are classified as objects that constitute a chart area.
- Since a text block may also be an object constituting the chart area, the text blocks are further classified into objects constituting the text area and objects constituting the chart area.
- For this purpose, a histogram of object area is generated for each page (that is, one slide of the presentation) (step A2-2).
- Text blocks that make up the text area describe a certain amount of content as natural sentences in one block, so few of them appear on one slide, and the characters in each block tend to be large in size and many in number.
- Text blocks that make up the chart area describe a single word or phrase in one block, so many of them appear on one slide, and the characters in each block tend to be small in size and few in number.
- In other words, text blocks constituting the text area have a large area and a low appearance frequency, whereas text blocks constituting the chart area have a small area and a high appearance frequency.
- Therefore, as shown in FIG. 4, an area histogram is generated from the MBR area of each text block; objects with an area larger than the mode area are classified as objects constituting the text area, and objects with an area equal to or smaller than the mode area are classified as objects constituting the chart area (step A2-3). However, if the classification into objects that contain text and objects that do not shows that every object on a slide contains text, all of these objects are classified as objects constituting the text area. In the above example, objects whose area equals the mode area are classified as objects constituting the chart area, but the present invention is not limited to this, and such objects may instead be classified as objects constituting the text area.
- Through steps A2-1 to A2-3, the objects are classified into objects constituting the text area and objects constituting the chart area (steps A2-4 and A2-5).
- Alternatively, objects whose area is larger than the mode area and equal to or larger than the area at which the frequency increases again may be classified as objects constituting the text area.
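The classification in steps A2-1 to A2-3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `SlideObject` record, the function name, the fixed histogram bin width, and the tie-breaking rule for the mode are all hypothetical and are not specified in this document.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SlideObject:
    # Hypothetical record: MBR corners plus a flag for whether the object holds text.
    x0: float
    y0: float
    x1: float
    y1: float
    has_text: bool

    @property
    def area(self) -> float:
        return (self.x1 - self.x0) * (self.y1 - self.y0)


def classify_objects(objects, bin_width=1000.0):
    """Split one slide's objects into (text_area_objects, chart_area_objects)."""
    # Step A2-1: separate objects that contain text from those that do not.
    text_blocks = [o for o in objects if o.has_text]
    others = [o for o in objects if not o.has_text]
    # Special case from the text: a slide holding only text objects is all text area.
    if not others:
        return text_blocks, []
    # Step A2-2: area histogram of the text blocks (bin width is an assumed parameter).
    bins = Counter(int(o.area // bin_width) for o in text_blocks)
    if not bins:
        return [], others
    # Mode bin; ties go to the smaller area (an arbitrary choice made here).
    mode_bin = min(bins, key=lambda b: (-bins[b], b))
    # Step A2-3: larger than the mode -> text area; mode or smaller -> chart area.
    text_area = [o for o in text_blocks if int(o.area // bin_width) > mode_bin]
    chart_area = others + [
        o for o in text_blocks if int(o.area // bin_width) <= mode_bin
    ]
    return text_area, chart_area
```

For a slide with one large body text block, three small labels, and one figure, this sketch places the body block in the text area and the labels and the figure in the chart area.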
- Next, the objects classified by the object classification means 120 into two types, those constituting the text area and those constituting the chart area, are each integrated to generate a text area and a chart area (step A3).
- Among the objects classified as objects constituting the text area, the text area generating means 130 integrates objects whose MBRs overlap and generates a new MBR (step A3-1).
- An example of this integration process is shown in FIG. 7, in which two overlapping objects at the top of the document are integrated into one. Objects that do not overlap may still have related content if they are visually close to each other, so such objects need to be integrated further. For this reason, in the present invention, a distance between objects that takes human visual impression into account (hereinafter referred to as the visual impression distance) is calculated (step A3-2).
- The visual impression distance is calculated for every pair of objects, and a text region is generated by integrating objects whose distance is equal to or less than a threshold value (step A3-3).
- Under the visual impression distance, two objects are judged to be closer the shorter the distance between the two opposing sides of their MBRs and the longer the overlap when those two sides are projected onto an axis parallel to them.
- FIG. 8 shows an example of the calculation of the visual impression distance D(A, B) between the MBR of the object A and the MBR of the object B.
- The distance between objects is calculated using this visual impression distance.
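As a rough illustration of this distance, the sketch below uses the ratio D1/D2 that this document gives elsewhere (D1: distance between the opposing MBR sides, D2: length of their overlap when projected onto a parallel axis). The `MBR` type, the function name, and the handling of overlapping or diagonally placed rectangles (returning `0.0` and `None` respectively) are assumptions made for the example.

```python
from typing import NamedTuple, Optional


class MBR(NamedTuple):
    # Axis-aligned rectangle with x0 < x1 and y0 < y1 (hypothetical layout).
    x0: float
    y0: float
    x1: float
    y1: float


def visual_impression_distance(a: MBR, b: MBR) -> Optional[float]:
    """Return D1 / D2 for two MBRs, or None when no opposing sides overlap."""
    # Signed overlap of the projections onto each axis (negative means a gap).
    x_overlap = min(a.x1, b.x1) - max(a.x0, b.x0)
    y_overlap = min(a.y1, b.y1) - max(a.y0, b.y0)
    if x_overlap > 0 and y_overlap > 0:
        return 0.0  # Assumption: overlapping rectangles are treated as distance 0.
    if x_overlap > 0:
        # Rectangles are stacked vertically: the gap is along y, the overlap along x.
        return -y_overlap / x_overlap
    if y_overlap > 0:
        # Rectangles sit side by side: the gap is along x, the overlap along y.
        return -x_overlap / y_overlap
    return None  # Diagonal placement: no projected overlap, distance left undefined.
```

Because the value is a ratio of two lengths, enlarging both rectangles by the same factor leaves it unchanged.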
- The visual impression distance is calculated for objects that overlap in the x-axis direction, and objects whose visual impression distance is equal to or smaller than the threshold (that is, visually close) are integrated.
- Likewise, the visual impression distance is calculated for objects that overlap in the y-axis direction, and objects whose visual impression distance is equal to or smaller than the threshold are integrated.
- Finally, the objects integrated in the x-axis direction and those integrated in the y-axis direction are integrated together.
- An example of integration processing based on the visual impression distance is shown in FIG. 10.
- The MBRs shown are those generated as a result of integrating the overlapping objects in step A3-1.
- MBR3 and MBR5, and MBR4 and MBR5, are integrated in the x-axis direction.
- MBR1 and MBR2, and MBR3 and MBR4, are integrated in the y-axis direction.
- By superimposing the integration results in the x-axis direction and the y-axis direction, MBR1 and MBR2 are finally integrated into one region, and MBR3, MBR4, and MBR5 into another.
- As the threshold for MBR integration based on the visual impression distance, for example, the average value of the distances over all pairs of MBRs included in one slide may be used. Alternatively, a fixed value may be given in advance.
- a text area is generated by the above processing.
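The integration in steps A3-2 and A3-3 might be sketched as below. This is a simplified, single-pass illustration: rectangles are plain `(x0, y0, x1, y1)` tuples, the distance is the D1/D2 ratio, pairs with no projected overlap are skipped when averaging, and groups are formed with a union-find structure. All of these are assumptions of the example and not details confirmed by this document.

```python
from itertools import combinations


def distance(a, b):
    """D1 / D2 for rectangles (x0, y0, x1, y1); None if projections do not overlap."""
    x_ov = min(a[2], b[2]) - max(a[0], b[0])
    y_ov = min(a[3], b[3]) - max(a[1], b[1])
    if x_ov > 0 and y_ov > 0:
        return 0.0  # Assumption: overlapping rectangles count as distance 0.
    if x_ov > 0:
        return -y_ov / x_ov
    if y_ov > 0:
        return -x_ov / y_ov
    return None


def integrate(mbrs, threshold=None):
    """Merge rectangles whose distance is at or below the threshold."""
    pairs = {}
    for i, j in combinations(range(len(mbrs)), 2):
        d = distance(mbrs[i], mbrs[j])
        if d is not None:
            pairs[(i, j)] = d
    # Default threshold: average distance over all comparable pairs on the slide.
    if threshold is None and pairs:
        threshold = sum(pairs.values()) / len(pairs)

    parent = list(range(len(mbrs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Linking x-direction and y-direction pairs in one structure has the same
    # effect as superimposing the two integration results.
    for (i, j), d in pairs.items():
        if d <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, m in enumerate(mbrs):
        groups.setdefault(find(i), []).append(m)
    # Each region is the bounding rectangle of its group.
    return sorted(
        (min(m[0] for m in g), min(m[1] for m in g),
         max(m[2] for m in g), max(m[3] for m in g))
        for g in groups.values()
    )
```

Passing an explicit `threshold` corresponds to the fixed-value option mentioned above. A fuller version would recompute distances after each merge.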
- the chart area generation means 140 performs the processing shown in the flowchart of FIG. 6 for the MBR of each object classified as an object constituting the chart area. Thereby, a chart area is generated.
- the chart area is generated by the chart area generator 140.
- a text area may be generated by the text area generating means 130.
- With the visual impression distance formula of the present embodiment, the distance used in the object integration processing is calculated as a relative distance rather than an absolute distance between objects, so the same value is obtained when a plurality of objects are enlarged or reduced together (see FIG. 11). Closeness can therefore be judged from the ratio between the size of the objects and the blank area between them, regardless of the absolute sizes of the objects and of the blank area.
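The scale invariance can be checked directly from the ratio form D1/D2 that this document gives for the visual impression distance (D1: distance between the opposing MBR sides, D2: length of their projected overlap). The scale factor s is a symbol introduced here only for illustration.

```latex
D(A, B) = \frac{D_1}{D_2},
\qquad
D(sA, sB) = \frac{s\,D_1}{s\,D_2} = \frac{D_1}{D_2} = D(A, B)
\quad (s > 0)
```

Both lengths scale by the same factor, so the factor cancels and the value is unchanged.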
- The visual impression distance may also be defined as shown in FIG. 12, using the following quantities.
- A_y: the length of the MBR of the object A in the y-axis direction
- B_y: the length of the MBR of the object B in the y-axis direction
- d_y(A, B): the distance in the y-axis direction between the opposing sides of the MBRs of the two objects
- join_x(A, B): the length when the two opposing sides of the MBR of the object A and the MBR of the object B are projected onto the x-axis, which is parallel to those sides
- A_x: the length of the MBR of the object A in the x-axis direction
- B_x: the length of the MBR of the object B in the x-axis direction
- d_x(A, B): the distance in the x-axis direction between the opposing sides of the MBRs of the two objects
- join_y(A, B): the length when the two opposing sides of the MBR of the object A and the MBR of the object B are projected onto the y-axis, which is parallel to those sides
- With this definition, two objects are calculated as being closer to each other the larger they are relative to the distance between them and the larger the proportion of their overlapping portions.
- the area information generation means 150 generates area information representing these areas from the text area and the chart area generated by the text area generation means 130 and the chart area generation means 140 (step A4).
- FIG. 13 shows an example of area information.
- The area information includes a document ID, a slide ID, and, for each area, its MBR coordinates, area type, centroid coordinates, area, and aspect ratio.
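One way to picture such a record is the sketch below. The field names, the use of the MBR center as the centroid, and the definition of aspect ratio as width divided by height are assumptions for illustration; the document itself only lists which items the area information contains.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RegionInfo:
    # Hypothetical field names mirroring the items listed for the area information.
    document_id: str
    slide_id: int
    region_type: str                       # "text" or "chart"
    mbr: Tuple[float, float, float, float]  # (x0, y0, x1, y1)
    centroid: Tuple[float, float]
    area: float
    aspect_ratio: float


def make_region_info(document_id, slide_id, region_type, mbr):
    """Derive the centroid, area, and aspect ratio from a region's MBR."""
    x0, y0, x1, y1 = mbr
    width, height = x1 - x0, y1 - y0
    return RegionInfo(
        document_id=document_id,
        slide_id=slide_id,
        region_type=region_type,
        mbr=mbr,
        centroid=((x0 + x1) / 2, (y0 + y1) / 2),  # Assumed: center of the MBR.
        area=width * height,
        aspect_ratio=width / height,              # Assumed: width over height.
    )
```

For example, `make_region_info("doc1", 3, "text", (0, 0, 40, 10))` yields a region with centroid `(20.0, 5.0)`, area `400`, and aspect ratio `4.0`.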
- As described above, in this embodiment the objects that make up a document are classified into objects constituting the text area and objects constituting the chart area, and the objects are then integrated, so the document can be properly divided into a text area and a chart area. This makes it possible to perform region-specific processing accurately and efficiently, such as extracting only the text regions or only the diagram regions from a document, or performing character recognition only on the text regions.
- the second embodiment provides an information processing system capable of searching for similar documents based on the arrangement of text areas and image areas, and a method and program thereof.
- The information processing system 100 includes object extraction means 110, object classification means 120, text area generation means 130, chart area generation means 140, area information generation means 150, area information storage means 160, area information conversion means 170, and similarity calculation means 180.
- The object extraction means 110, the object classification means 120, the text area generation means 130, the chart area generation means 140, and the area information generation means 150 are the same as in the configuration of the first embodiment shown in FIG. 1, so their description is omitted.
- the region information storage unit 160 stores the region information of the electronic document and document image output from the region information generation unit 150.
- the area information conversion means 170 converts a search query related to the position and size of a text area and a chart area of a document into area information.
- the query is an item input by the user for document search.
- the similarity calculation unit 180 compares and collates the region information stored in the region information storage unit 160 with the region information output from the region information conversion unit 170, calculates similarity, and searches for similar documents.
- the electronic document and the document image are divided into areas in advance, and the area information is stored in the area information storage means 160.
- FIG. 16 is an example of a query input screen 200 for the layout of slides included in a document.
- a user inputs a slide layout using an input unit such as a keyboard or a mouse through a screen displayed on an output unit (not shown) such as a display connected to the computer 100.
- the user first selects either the text area or the chart area by the area selection unit 210.
- In the layout input unit 220, when a rectangle is designated by a mouse drag or the like, a rectangular region corresponding to the region type selected in the region selection unit 210 is drawn. A drawn rectangle may also be selected with a mouse or the like and then moved, reshaped, or enlarged or reduced. In the example of FIG. 16, a text area is specified at the top of the slide, and a chart area is specified at the bottom of the slide.
- When the search button 230 is pressed, a document search based on the layout designated in the layout input unit 220 is started.
- When the clear button 240 is pressed, the rectangles drawn in the layout input unit 220 are deleted, and the layout can be input again.
- When the search button 230 is pressed, the region information conversion unit 170 first converts the search query concerning the position and size of the text regions and chart regions specified in the layout input unit 220 into region information of the same form as that stored in the region information storage means 160 (step B2). At this time, if the user has specified a plurality of regions of the same region type in step B1, the region integration processing using the visual impression distance, shown in steps A3-2 and A3-3 of the flowchart of FIG. 6, is performed before the conversion into region information. For example, in the example shown in FIG. 17, two text areas and two chart areas are integrated into one text area and one chart area, respectively, as a result of this region integration processing. The user may also be allowed to select whether or not to perform the region integration processing using the visual impression distance.
- The similarity calculation unit 180 compares the region information converted by the region information conversion unit 170 from the layout query input by the user with the region information of each document stored in the region information storage unit 160, and thereby calculates the similarity between the layout of the regions input by the user and the layout of the regions of the divided document (step B3).
- The similarity is, for example, the average value of the region similarities, each of which is the similarity between a pair of corresponding regions.
- As a formula for calculating the region similarity, for example, a cosine measure based on the angle θ formed by the feature vectors obtained from the region information is used for regions having the same region type (text region or chart region).
- When the feature vector is the four-dimensional vector consisting of the centroid x coordinate v1, the centroid y coordinate v2, the area v3, and the aspect ratio v4 taken from the region information shown in FIG. 13, the similarity sim(Q, Ri) under the cosine measure between the feature vector Q of a region converted from the query input by the user and the feature vector Ri of a region stored in the region information storage means 160 can be obtained as shown in FIG. 18.
- As shown in FIG. 19, for each region included in the region information converted from the query, the similarity calculation unit 180 calculates the region similarity against every region included in the region information of each document.
- The region having the maximum similarity is associated as the region corresponding to the region converted from the query, and that value is defined as the region similarity between the two regions.
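The matching described above can be sketched as follows, using the standard cosine measure (dot product divided by the product of the vector norms). The function names, the `(region_type, feature_vector)` tuple layout, and the score of `0.0` for a query region with no same-type counterpart are assumptions of this example; the exact formulas are those of the figures, which are not reproduced here.

```python
import math


def cosine(q, r):
    """Cosine of the angle between two feature vectors (v1, v2, v3, v4)."""
    dot = sum(a * b for a, b in zip(q, r))
    norm_q = math.sqrt(sum(a * a for a in q))
    norm_r = math.sqrt(sum(b * b for b in r))
    if norm_q == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_q * norm_r)


def layout_similarity(query_regions, slide_regions):
    """Average, over query regions, of the best same-type region similarity.

    Each region is a (region_type, feature_vector) pair, where the feature
    vector holds centroid x, centroid y, area, and aspect ratio.
    """
    scores = []
    for region_type, q in query_regions:
        # Compare against every stored region of the same type, keep the maximum.
        candidates = [cosine(q, r) for t, r in slide_regions if t == region_type]
        scores.append(max(candidates, default=0.0))  # Assumed: 0.0 if no match.
    return sum(scores) / len(scores) if scores else 0.0
```

Slides can then be ranked by this value in descending order. Because the four components have different scales, a practical system would probably normalize them first; that step is not described here and is left out.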
- the similarity calculation unit 180 identifies slides having regions similar to the region layout input by the user in step B3, sorts them in descending order of similarity, and presents them to the user (step B4).
- In addition to the layout, keywords as used in conventional keyword search may be specified at the same time.
- FIG. 21 shows an example of a query input screen 200 for designating a document layout and keywords as a search query.
- the user performs layout input in the same manner as described above, and further specifies a keyword included in the slide with the keyword input unit 260.
- When the search button 230 is pressed, a document search based on the layout specified in the layout input unit 220 and the keyword specified in the keyword input unit 260 is started.
- a slide including a specified keyword can be searched using a related technique for keyword search.
- the search process combining the layout and the keyword operates so as to calculate the above-described layout similarity only for the slide searched by the keyword search.
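A minimal sketch of this two-stage search is shown below. The slide dictionaries, the substring keyword match, and the `layout_score` callable are hypothetical stand-ins; any keyword search technique and any layout similarity function could be substituted.

```python
def search_slides(slides, keyword, layout_score):
    """Return the slides containing `keyword`, best layout match first."""
    # Stage 1: keyword search narrows the candidates (simple substring match here).
    hits = [s for s in slides if keyword.lower() in s["text"].lower()]
    # Stage 2: layout similarity is evaluated only for the keyword hits.
    scored = [(layout_score(s), s) for s in hits]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored]
```

Restricting the similarity calculation to the keyword hits keeps the more expensive layout comparison off slides that cannot match.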
- When the layout clear button 250 and the keyword clear button 260 are pressed, the rectangles drawn in the layout input unit 220 and the keyword entered in the keyword input unit 260 are deleted, and the region layout and the keyword can be input again.
- When inputting the layout of the regions, the user may also be allowed to assign weights indicating which of the text area and the chart area, or which of the input regions, should be emphasized, according to the user's confidence in his or her memory.
- As described above, in this embodiment, region information generated in advance by dividing electronic documents or document images is compared and collated with region information generated from the user's query about the layout of the regions, and documents having a similar layout are retrieved. A document can therefore be searched for based on the arrangement of the text area and the chart area even when the keywords it contains are not accurately remembered. That is, similar documents can be searched based on the arrangement of the text area and the image area.
- In addition, documents can be searched based on a combination of the layout of the text area and the chart area and keywords.
- In the embodiments described above, each component is configured by hardware, but the components may also be realized by a computer including a CPU and memory.
- One aspect is an information processing system having object classification means for classifying objects constituting a document, extracted from an electronic document or a document image, into objects constituting a text region and objects constituting a chart region, using at least an area histogram of the objects including text.
- The object classification means calculates an area histogram of the objects including text and, according to a comparison with the area that is the mode value, classifies the objects including text into objects constituting the text area and objects constituting the chart area.
- The object classification means calculates an area histogram of the objects including text, classifies objects having an area larger than the area that is the mode value as objects constituting the text area, and classifies objects including text whose area is smaller than the mode, together with objects not including text, as objects constituting the chart area.
- The object classification means calculates an area histogram of the objects including text and classifies objects having an area larger than the area that is the mode value and larger than the area at which the frequency increases again as objects constituting the text area. Objects including text that are not classified as constituting the text area, and objects not including text, are classified as objects constituting the chart area.
- the fifth aspect includes an object extracting means for extracting an object constituting the document from the electronic document or the document image in the above aspect.
- A sixth aspect, in the above aspect, has text area generation means for generating a text area by integrating the objects constituting the text area based on the visual impression distance, which is a distance between objects that takes human visual impression into account; chart area generation means for generating a chart area by integrating the objects constituting the chart area based on the visual impression distance; and area information generation means for generating and outputting information representing the text area and the chart area.
- The text area generation means or the chart area generation means integrates objects constituting an area when their minimum circumscribed rectangles, formed of sides parallel to the x axis and the y axis, overlap each other. When the minimum circumscribed rectangles do not overlap each other, the two objects are projected onto the x-axis or the y-axis, and D1/D2 is calculated as the visual impression distance, where D1 is the distance between the opposing sides of the respective minimum circumscribed rectangles and D2 is the length of the overlapping portion when the opposing sides are projected onto an axis parallel to them. The value of the visual impression distance D1/D2 is compared with a threshold value to determine whether to integrate the two objects, and this integration is performed for each of the x-axis direction and the y-axis direction to generate a region.
- The text area generation unit or the chart area generation unit uses the following quantities for the minimum circumscribed rectangles, formed of sides parallel to the x-axis and the y-axis, of the objects constituting the area.
- D1 is the distance between the opposing sides of the respective minimum circumscribed rectangles.
- D2 is the length of the overlapping portion when the opposing sides are projected onto the axis parallel to them.
- D3 is the sum of the lengths of the two objects in the direction perpendicular to the opposing sides.
- the text area generation unit or the chart area generation unit calculates a visual impression distance for all combinations of minimum circumscribed rectangles of any two objects included in one slide.
- the average value is set as the threshold value.
- in a tenth aspect, the above aspect further includes: area information storage means for storing area information of electronic documents and image documents; area information conversion means for converting a user-entered query about the layout of the regions of an electronic document or image document into area information; and similarity calculation means for calculating a similarity by comparing the area information stored in the area information storage means with the area information produced by the area information conversion means. A document whose layout is similar to the layout of the document regions entered by the user is retrieved.
- in an eleventh aspect, in the above aspect, the similarity calculation means calculates the similarity by comparing, for each region type (text area and chart area), the barycentric coordinate values representing the position of the region, the area representing the size of the region, and the aspect ratio representing the shape of the region.
- in a twelfth aspect, in the above aspect, the similarity calculation means uses, in calculating the similarity, the cosine of the angle formed by the feature vectors of two regions, each consisting of the x-coordinate of the centroid, the y-coordinate of the centroid, the area, and the aspect ratio.
- a thirteenth aspect further includes, in the above aspect, keyword search means for searching for documents that include an input keyword, wherein the similarity calculation means calculates the similarity only for the documents found by the keyword search means, so that a document that includes the keyword entered by the user and has a layout similar to the layout of the document regions entered by the user is retrieved.
- a fourteenth aspect is an information processing method having object classification processing that classifies the objects constituting a document, extracted from an electronic document or a document image, into objects constituting a text area and objects constituting a chart area, using at least an area histogram of the objects that include text.
- in a fifteenth aspect, in the above aspect, the object classification processing calculates an area histogram of the objects that include text and, according to a comparison with the area that is the mode, classifies the objects that include text into objects constituting the text area and objects constituting the chart area.
- in a sixteenth aspect, in the above aspect, the object classification processing calculates an area histogram of the objects that include text, classifies objects whose area is larger than the area that is the mode as objects constituting the text area, and classifies objects whose area is smaller than the mode, together with objects that do not include text, as objects constituting the chart area.
- in a seventeenth aspect, in the above aspect, the object classification processing calculates an area histogram of the objects that include text and classifies objects whose area is larger than the area that is the mode, and also larger than the area at which the frequency rises again, as objects constituting the text area. Objects that include text but were not classified as constituting the text area, together with objects that do not include text, are classified as objects constituting the chart area.
- the eighteenth aspect has an object extraction process for extracting an object constituting a document from an electronic document or a document image in the above aspect.
- a nineteenth aspect has, in the above aspect: text area generation processing that generates a text area by integrating the objects constituting the text area based on a visual impression distance, which is a distance between objects that takes the human visual impression into account; chart area generation processing that generates a chart area by integrating the objects constituting the chart area based on the visual impression distance; and area information generation processing that generates and outputs information representing the text area and the chart area.
- in a twentieth aspect, in the above aspect, the text area generation processing or the chart area generation processing handles objects whose minimum circumscribed rectangles, formed of sides parallel to the x axis and the y axis, overlap each other, or, when the rectangles do not overlap, objects that overlap when projected onto the x axis or the y axis. With D1 the distance between the mutually facing sides of the two minimum circumscribed rectangles and D2 the length of the overlapping portion when the facing sides are projected onto an axis parallel to them, D1 / D2 is calculated as the visual impression distance. Whether to integrate the two objects is determined by comparing D1 / D2 with a threshold value, and the integration is performed in each of the x-axis and y-axis directions to integrate the objects and generate a region.
- in a twenty-first aspect, in the above aspect, the same objects are handled, with D1 the distance between the mutually facing sides of the two minimum circumscribed rectangles, D2 the length of the overlapping portion when the facing sides are projected onto an axis parallel to them, D3 the sum of the lengths of the sides of the two objects perpendicular to the facing sides, and D4 the total length when the facing sides are projected onto an axis parallel to them. Whether to integrate the two objects is determined by comparing (D1 × D4) / (D2 × D3) with a threshold value, and the integration is performed in each of the x-axis and y-axis directions to integrate the objects and generate a region.
- in a twenty-second aspect, in the above aspect, the text area generation processing or the chart area generation processing calculates the visual impression distance for all combinations of the minimum circumscribed rectangles of any two objects included in one slide, and the average value is set as the threshold value.
- a twenty-third aspect further includes, in the above aspect: area information conversion processing that converts a user-entered query about the layout of the regions of an electronic document or image document into area information; and similarity calculation processing that calculates a similarity by comparing the area information of electronic documents and image documents with the area information produced by the area information conversion processing. A document whose layout is similar to the layout of the document regions entered by the user is retrieved.
- in a twenty-fourth aspect, in the above aspect, the similarity calculation processing calculates the similarity by comparing, for each region type (text area and chart area), the barycentric coordinate values representing the position of the region, the area representing the size of the region, and the aspect ratio representing the shape of the region.
- in a twenty-fifth aspect, in the above aspect, the similarity calculation processing uses, in calculating the similarity, the cosine of the angle formed by the feature vectors of two regions, each consisting of the x-coordinate of the centroid, the y-coordinate of the centroid, the area, and the aspect ratio.
- a twenty-sixth aspect further includes, in the above aspect, keyword search processing that searches for documents that include an input keyword, wherein the similarity calculation processing calculates the similarity only for the documents found by the keyword search processing, so that a document that includes the keyword entered by the user and has a layout similar to the layout of the document regions entered by the user is retrieved.
- a twenty-seventh aspect is a program that causes an information processing apparatus to execute object classification processing that classifies the objects constituting a document, extracted from an electronic document or a document image, into objects constituting a text area and objects constituting a chart area, using at least an area histogram of the objects that include text.
- in a twenty-eighth aspect, in the above aspect, the object classification processing calculates an area histogram of the objects that include text and, according to a comparison with the area that is the mode, classifies the objects that include text into objects constituting the text area and objects constituting the chart area.
- in a twenty-ninth aspect, in the above aspect, the object classification processing calculates an area histogram of the objects that include text, classifies objects whose area is larger than the area that is the mode as objects constituting the text area, and classifies objects whose area is smaller than the mode, together with objects that do not include text, as objects constituting the chart area.
- in a thirtieth aspect, in the above aspect, the object classification processing calculates an area histogram of the objects that include text and classifies objects whose area is larger than the area that is the mode, and also larger than the area at which the frequency rises again, as objects constituting the text area. Objects that include text but were not classified as constituting the text area, together with objects that do not include text, are classified as objects constituting the chart area.
- the information processing apparatus executes an object extraction process for extracting an object constituting a document from an electronic document or a document image.
- a thirty-second aspect has, in the above aspect: text area generation processing that generates a text area by integrating the objects constituting the text area based on a visual impression distance, which is a distance between objects that takes the human visual impression into account; chart area generation processing that generates a chart area by integrating the objects constituting the chart area based on the visual impression distance; and area information generation processing that generates and outputs information representing the text area and the chart area.
- in a thirty-third aspect, in the above aspect, the text area generation processing or the chart area generation processing handles objects whose minimum circumscribed rectangles, formed of sides parallel to the x axis and the y axis, overlap each other, or, when the rectangles do not overlap, objects that overlap when projected onto the x axis or the y axis. With D1 the distance between the mutually facing sides of the two minimum circumscribed rectangles and D2 the length of the overlapping portion when the facing sides are projected onto an axis parallel to them, D1 / D2 is calculated as the visual impression distance. Whether to integrate the two objects is determined by comparing D1 / D2 with a threshold value, and the integration is performed in each of the x-axis and y-axis directions to integrate the objects and generate a region.
- in a thirty-fourth aspect, in the above aspect, the same objects are handled, with D1 the distance between the mutually facing sides of the two minimum circumscribed rectangles, D2 the length of the overlapping portion when the facing sides are projected onto an axis parallel to them, D3 the sum of the lengths of the sides of the two objects perpendicular to the facing sides, and D4 the total length when the facing sides are projected onto an axis parallel to them. Whether to integrate the two objects is determined by comparing (D1 × D4) / (D2 × D3) with a threshold value, and the integration is performed in each of the x-axis and y-axis directions to integrate the objects and generate a region.
- in a thirty-fifth aspect, in the above aspect, the text area generation processing or the chart area generation processing calculates the visual impression distance for all combinations of the minimum circumscribed rectangles of any two objects included in one slide, and the average value is set as the threshold value.
- in a thirty-sixth aspect, in the above aspect, the information processing apparatus is caused to execute: area information conversion processing that converts a user-entered query about the layout of the regions of an electronic document or image document into area information; and similarity calculation processing that calculates a similarity by comparing the area information of electronic documents and image documents with the area information produced by the area information conversion processing. A document whose layout is similar to the layout of the document regions entered by the user is retrieved.
- in a thirty-seventh aspect, in the above aspect, the similarity calculation processing calculates the similarity by comparing, for each region type (text area and chart area), the barycentric coordinate values representing the position of the region, the area representing the size of the region, and the aspect ratio representing the shape of the region.
- in a thirty-eighth aspect, in the above aspect, the similarity calculation processing uses, in calculating the similarity, the cosine of the angle formed by the feature vectors of two regions, each consisting of the x-coordinate of the centroid, the y-coordinate of the centroid, the area, and the aspect ratio.
- in a thirty-ninth aspect, in the above aspect, the information processing apparatus is caused to execute keyword search processing that searches for documents that include an input keyword, and the similarity calculation processing calculates the similarity only for the documents found by the keyword search processing, so that a document that includes the keyword entered by the user and has a layout similar to the layout of the document regions entered by the user is retrieved.
- according to the present invention, a document with a complex and varied layout, such as a presentation document, can be appropriately divided into text areas and chart areas.
- the reason is that the objects constituting the document are extracted and classified into objects constituting text areas and objects constituting chart areas, and whether to integrate objects is then determined from the shape of the blank space between the classified objects; integrating the objects accordingly generates the text areas and chart areas.
- the present invention can be applied to an information extraction apparatus that extracts only text areas or only chart areas from an electronic document or a document image, to an information processing apparatus that processes the extracted areas accurately and efficiently, and to programs that implement these on a computer.
- it can also be applied to an information retrieval device that retrieves documents from a database based on the layout of text areas or chart areas.
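The object classification by area histogram described in the aspects above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the dictionary keys `area` and `has_text`, the function name `classify_objects`, and the fixed `bin_width` are assumptions made for illustration, and objects that fall exactly in the mode bin are sent to the chart side because the text only specifies "larger than" and "smaller than" the mode.

```python
from collections import Counter


def classify_objects(objects, bin_width=1000.0):
    """Split objects into (text_objects, chart_objects).

    objects: list of dicts with hypothetical keys
             'area' (float) and 'has_text' (bool).
    bin_width: histogram bin size; an assumed tuning parameter.
    """
    text_bearing = [o for o in objects if o["has_text"]]
    if not text_bearing:
        # Without text objects there is no histogram; everything is chart.
        return [], list(objects)

    # Area histogram over the text-bearing objects only.
    hist = Counter(int(o["area"] // bin_width) for o in text_bearing)
    # Mode bin: highest frequency, ties broken by the smaller area bin.
    mode_bin = min(hist, key=lambda b: (-hist[b], b))

    text_objects, chart_objects = [], []
    for o in objects:
        if o["has_text"] and int(o["area"] // bin_width) > mode_bin:
            text_objects.append(o)   # larger than the mode area
        else:
            chart_objects.append(o)  # small labels and non-text objects
    return text_objects, chart_objects
```

The intuition, as far as the text suggests, is that many small text boxes (labels inside figures) dominate the histogram, so a text box noticeably larger than the most common size is more likely to be body text.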
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Processing Or Creating Images (AREA)
- Image Analysis (AREA)
Abstract
Description
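This related document image layout analysis system consists of basic line extraction means and line/column mutual extraction means.
110 Object extraction means
120 Object classification means
130 Text area generation means
140 Chart area generation means
150 Area information generation means
160 Area information storage means
170 Area information conversion means
180 Similarity calculation means
200 Query input screen
210 Area selection section
220 Layout input section
230 Search button
240 (Layout) clear button
250 Layout clear button
260 Keyword input section
270 Keyword clear button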
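Embodiments of the present invention will be described in detail with reference to the drawings.
(1) The text blocks that make up a text area are laid out on a rectangular basis.
(2) Closely related objects are placed near one another so that they look like a single group.
(3) Those groups of objects are placed with space between them so that each group can be distinguished.
These are the characteristic features.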
D(A,B)=d(A,B)×1/overlap(A,B)
is obtained.
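As a minimal sketch of the formula D(A,B) = d(A,B) × 1/overlap(A,B), the following Python computes the distance for two vertically separated bounding rectangles. The `Rect` type, the function names, and the choice to return `None` when the x-projections do not overlap are assumptions for illustration; the source only defines d as the gap between facing sides and overlap as the length of the shared projection.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned minimum bounding rectangle (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def overlap_1d(lo1, hi1, lo2, hi2):
    """Length of the overlap of two intervals, 0.0 if they are disjoint."""
    return max(0.0, min(hi1, hi2) - max(lo1, lo2))


def visual_distance_y(a: Rect, b: Rect) -> Optional[float]:
    """D(A,B) = d(A,B) / overlap(A,B) in the vertical direction.

    d is the vertical gap between the facing sides (0 if the rectangles
    overlap vertically); overlap is the shared length of the x-projections.
    Returns None when the x-projections do not overlap (assumption).
    """
    overlap = overlap_1d(a.x0, a.x1, b.x0, b.x1)
    if overlap == 0.0:
        return None
    gap = max(0.0, max(a.y0, b.y0) - min(a.y1, b.y1))
    return gap / overlap
```

A small gap combined with a long shared edge yields a small value, which matches the idea that such objects look like one group.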
Dy(A,B)=dy(A,B)/(Ay+By)×1/(overlapx(A,B)/joinx(A,B))
=(dy(A,B)×joinx(A,B))/((Ay+By)×overlapx(A,B))
is obtained.
Dx(A,B)=dx(A,B)/(Ax+Bx)×1/(overlapy(A,B)/joiny(A,B))
=(dx(A,B)×joiny(A,B))/((Ax+Bx)×overlapy(A,B))
is obtained.
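The normalized distance Dy above, together with the mean-based threshold mentioned elsewhere in this document, can be sketched as follows. This is an illustrative reading rather than the patent's code: `Rect`, `normalized_distance_y`, and `mean_threshold` are hypothetical names, Ay and By are taken to be the heights of the two rectangles, joinx the total extent of the combined x-projection, and pairs whose x-projections do not overlap are skipped when averaging (an assumption, since the distance is undefined for them). Rectangles are assumed to have positive height.

```python
from itertools import combinations
from statistics import mean
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned minimum bounding rectangle (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def normalized_distance_y(a: Rect, b: Rect) -> Optional[float]:
    """Dy = (dy * joinx) / ((Ay + By) * overlapx); None if no x-overlap."""
    overlap_x = min(a.x1, b.x1) - max(a.x0, b.x0)
    if overlap_x <= 0:
        return None
    d_y = max(0.0, max(a.y0, b.y0) - min(a.y1, b.y1))  # gap between facing sides
    join_x = max(a.x1, b.x1) - min(a.x0, b.x0)         # total projected length
    heights = (a.y1 - a.y0) + (b.y1 - b.y0)            # Ay + By
    return (d_y * join_x) / (heights * overlap_x)


def mean_threshold(rects) -> Optional[float]:
    """Average Dy over all pairs for which Dy is defined."""
    values = []
    for a, b in combinations(rects, 2):
        d = normalized_distance_y(a, b)
        if d is not None:
            values.append(d)
    return mean(values) if values else None
```

The x-direction variant Dx follows by swapping the roles of the x and y coordinates.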
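<Second Embodiment>
The best mode for carrying out the second invention of the present invention will be described in detail with reference to the drawings.
The information processing system 100 includes object extraction means 110, object classification means 120, text area generation means 130, chart area generation means 140, area information generation means 150, area information storage means 160, area information conversion means 170, and similarity calculation means 180.
Similarity = ((similarity between text area 1 and text area a) + (similarity between chart area 2 and chart area b) + (similarity between chart area 3 and chart area c)) / 3
is obtained.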
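The averaging above can be combined with the cosine measure named in the claims in a short Python sketch. It assumes each region is already reduced to a feature vector (centroid x, centroid y, area, aspect ratio) and that query regions have already been paired with document regions of the same type; how the pairing is done is not covered here, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two feature vectors; 0.0 for a zero vector."""
    dot = sum(p * q for p, q in zip(u, v))
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(q * q for q in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def layout_similarity(pairs):
    """Average region similarity over (query_vector, document_vector) pairs.

    Each vector is (centroid_x, centroid_y, area, aspect_ratio).
    """
    if not pairs:
        return 0.0
    return sum(cosine(q, d) for q, d in pairs) / len(pairs)
```

Because the four features have very different scales, a practical system would probably normalize them before taking the cosine; the source does not say whether it does so.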
Claims (39)
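- An information processing system comprising object classification means for classifying objects constituting a document, extracted from an electronic document or a document image, into objects constituting a text area and objects constituting a chart area, using at least an area histogram of the objects that include text.
- The information processing system according to claim 1, wherein the object classification means calculates an area histogram of the objects that include text and, according to a comparison with the area that is the mode, classifies the objects that include text into objects constituting a text area and objects constituting a chart area.
- The information processing system according to claim 1 or claim 2, wherein the object classification means is configured to calculate an area histogram of the objects that include text, to classify objects having an area larger than the area that is the mode as objects constituting a text area, and to classify objects having an area smaller than the mode and objects that do not include text as objects constituting a chart area.
- The information processing system according to claim 1 or claim 2, wherein the object classification means is configured to calculate an area histogram of the objects that include text, to classify objects having an area larger than the area that is the mode and larger than the area at which the frequency rises again as objects constituting a text area, and to classify objects that include text but were not classified as objects constituting a text area, together with objects that do not include text, as objects constituting a chart area.
- The information processing system according to any one of claims 1 to 4, comprising object extraction means for extracting the objects constituting a document from an electronic document or a document image.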
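- The information processing system according to any one of claims 1 to 5, comprising: text area generation means for generating a text area by integrating the objects constituting the text area based on a visual impression distance, which is a distance between objects that takes the human visual impression into account; chart area generation means for generating a chart area by integrating the objects constituting the chart area based on the visual impression distance; and area information generation means for generating and outputting information representing the text area and the chart area.
- The information processing system according to claim 6, wherein the text area generation means or the chart area generation means is configured to: for objects whose minimum circumscribed rectangles, formed of sides parallel to the x axis and the y axis, overlap each other, or, when the minimum circumscribed rectangles do not overlap, for objects that overlap when the two objects are projected onto the x axis or the y axis, calculate D1/D2 as the visual impression distance, where D1 is the distance between the mutually facing sides of the respective minimum circumscribed rectangles and D2 is the length of the overlapping portion when the facing sides are projected onto an axis parallel to them; determine whether to integrate the two objects according to a comparison between the value of the visual impression distance D1/D2 and a threshold value; and, when integrating, perform the processing of integrating the two objects in each of the x-axis direction and the y-axis direction, thereby integrating objects and generating a region.
- The information processing system according to claim 6, wherein the text area generation means or the chart area generation means is configured to: for objects whose minimum circumscribed rectangles, formed of sides parallel to the x axis and the y axis, overlap each other, or, when the minimum circumscribed rectangles do not overlap, for objects that overlap when the two objects are projected onto the x axis or the y axis, let D1 be the distance between the mutually facing sides of the respective minimum circumscribed rectangles, D2 the length of the overlapping portion when the facing sides are projected onto an axis parallel to them, D3 the sum of the lengths of the sides of the two objects perpendicular to the facing sides, and D4 the total length when the facing sides are projected onto an axis parallel to them; determine whether to integrate the two objects according to a comparison between the value of (D1×D4)/(D2×D3) and a threshold value; and, when integrating, perform the processing of integrating the two objects in each of the x-axis direction and the y-axis direction, thereby integrating objects and generating a region.
- The information processing system according to any one of claims 6 to 8, wherein the text area generation means or the chart area generation means is configured to calculate the visual impression distance for all combinations of the minimum circumscribed rectangles of any two objects included in one slide and to use the average value as the threshold value.
- The information processing system according to claims 1 to 9, further comprising: area information storage means for storing area information of electronic documents and image documents; area information conversion means for converting a query, entered by a user, about the layout of the regions of an electronic document or image document into area information; and similarity calculation means for calculating a similarity by comparing the area information stored in the area information storage means with the area information converted by the area information conversion means, wherein a document having a layout similar to the layout of the document regions entered by the user is retrieved.
- The information processing system according to claim 10, wherein the similarity calculation means is configured to calculate the similarity by comparing, for each region type (text area and chart area), the barycentric coordinate values representing the position of the region, the area representing the size of the region, and the aspect ratio representing the shape of the region.
- The information processing system according to claim 11, wherein the similarity calculation means uses, in calculating the similarity, the cosine of the angle formed by the feature vectors of two regions, each consisting of the x-coordinate of the centroid, the y-coordinate of the centroid, the area, and the aspect ratio.
- The information processing system according to any one of claims 10 to 12, further comprising keyword search means for searching for documents that include an input keyword, wherein the similarity calculation means calculates the similarity only for the documents found by the keyword search means, and a document that includes the keyword entered by the user and has a layout similar to the layout of the document regions entered by the user is retrieved.
- An information processing method having object classification processing that classifies objects constituting a document, extracted from an electronic document or a document image, into objects constituting a text area and objects constituting a chart area, using at least an area histogram of the objects that include text.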
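- The information processing method according to claim 14, wherein the object classification processing calculates an area histogram of the objects that include text and, according to a comparison with the area that is the mode, classifies the objects that include text into objects constituting a text area and objects constituting a chart area.
- The information processing method according to claim 14 or claim 15, wherein the object classification processing calculates an area histogram of the objects that include text, classifies objects having an area larger than the area that is the mode as objects constituting a text area, and classifies objects having an area smaller than the mode and objects that do not include text as objects constituting a chart area.
- The information processing method according to claim 14 or claim 15, wherein the object classification processing calculates an area histogram of the objects that include text, classifies objects having an area larger than the area that is the mode and larger than the area at which the frequency rises again as objects constituting a text area, and classifies objects that include text but were not classified as objects constituting a text area, together with objects that do not include text, as objects constituting a chart area.
- The information processing method according to any one of claims 14 to 17, having object extraction processing that extracts the objects constituting a document from an electronic document or a document image.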
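- The information processing method according to any one of claims 14 to 18, having: text area generation processing that generates a text area by integrating the objects constituting the text area based on a visual impression distance, which is a distance between objects that takes the human visual impression into account; chart area generation processing that generates a chart area by integrating the objects constituting the chart area based on the visual impression distance; and area information generation processing that generates and outputs information representing the text area and the chart area.
- The information processing method according to claim 19, wherein the text area generation processing or the chart area generation processing: for objects whose minimum circumscribed rectangles, formed of sides parallel to the x axis and the y axis, overlap each other, or, when the minimum circumscribed rectangles do not overlap, for objects that overlap when the two objects are projected onto the x axis or the y axis, calculates D1/D2 as the visual impression distance, where D1 is the distance between the mutually facing sides of the respective minimum circumscribed rectangles and D2 is the length of the overlapping portion when the facing sides are projected onto an axis parallel to them; determines whether to integrate the two objects according to a comparison between the value of the visual impression distance D1/D2 and a threshold value; and, when integrating, performs the processing of integrating the two objects in each of the x-axis direction and the y-axis direction, thereby integrating objects and generating a region.
- The information processing system according to claim 19, wherein the text area generation processing or the chart area generation processing: for objects whose minimum circumscribed rectangles, formed of sides parallel to the x axis and the y axis, overlap each other, or, when the minimum circumscribed rectangles do not overlap, for objects that overlap when the two objects are projected onto the x axis or the y axis, lets D1 be the distance between the mutually facing sides of the respective minimum circumscribed rectangles, D2 the length of the overlapping portion when the facing sides are projected onto an axis parallel to them, D3 the sum of the lengths of the sides of the two objects perpendicular to the facing sides, and D4 the total length when the facing sides are projected onto an axis parallel to them; determines whether to integrate the two objects according to a comparison between the value of (D1×D4)/(D2×D3) and a threshold value; and, when integrating, performs the processing of integrating the two objects in each of the x-axis direction and the y-axis direction, thereby integrating objects and generating a region.
- The information processing method according to any one of claims 19 to 21, wherein the text area generation processing or the chart area generation processing calculates the visual impression distance for all combinations of the minimum circumscribed rectangles of any two objects included in one slide and uses the average value as the threshold value.
- The information processing method according to claims 14 to 22, further having: area information conversion processing that converts a query, entered by a user, about the layout of the regions of an electronic document or image document into area information; and similarity calculation processing that calculates a similarity by comparing the area information of electronic documents and image documents with the area information converted by the area information conversion processing, wherein a document having a layout similar to the layout of the document regions entered by the user is retrieved.
- The information processing method according to claim 23, wherein the similarity calculation processing calculates the similarity by comparing, for each region type (text area and chart area), the barycentric coordinate values representing the position of the region, the area representing the size of the region, and the aspect ratio representing the shape of the region.
- The information processing method according to claim 24, wherein the similarity calculation processing uses, in calculating the similarity, the cosine of the angle formed by the feature vectors of two regions, each consisting of the x-coordinate of the centroid, the y-coordinate of the centroid, the area, and the aspect ratio.
- The information processing method according to any one of claims 23 to 25, further having keyword search processing that searches for documents that include an input keyword, wherein the similarity calculation processing calculates the similarity only for the documents found by the keyword search processing, and a document that includes the keyword entered by the user and has a layout similar to the layout of the document regions entered by the user is retrieved.
- A program that causes an information processing apparatus to execute object classification processing that classifies objects constituting a document, extracted from an electronic document or a document image, into objects constituting a text area and objects constituting a chart area, using at least an area histogram of the objects that include text.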
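- The program according to claim 27, wherein the object classification processing calculates an area histogram of the objects that include text and, according to a comparison with the area that is the mode, classifies the objects that include text into objects constituting a text area and objects constituting a chart area.
- The program according to claim 27 or claim 28, wherein the object classification processing calculates an area histogram of the objects that include text, classifies objects having an area larger than the area that is the mode as objects constituting a text area, and classifies objects having an area smaller than the mode and objects that do not include text as objects constituting a chart area.
- The program according to claim 27 or claim 28, wherein the object classification processing calculates an area histogram of the objects that include text, classifies objects having an area larger than the area that is the mode and larger than the area at which the frequency rises again as objects constituting a text area, and classifies objects that include text but were not classified as objects constituting a text area, together with objects that do not include text, as objects constituting a chart area.
- The program according to any one of claims 27 to 30, which causes the information processing apparatus to execute object extraction processing that extracts the objects constituting a document from an electronic document or a document image.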
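- The program according to any one of claims 27 to 31, having: text area generation processing that generates a text area by integrating the objects constituting the text area based on a visual impression distance, which is a distance between objects that takes the human visual impression into account; chart area generation processing that generates a chart area by integrating the objects constituting the chart area based on the visual impression distance; and area information generation processing that generates and outputs information representing the text area and the chart area.
- The program according to claim 32, wherein the text area generation processing or the chart area generation processing: for objects whose minimum circumscribed rectangles, formed of sides parallel to the x axis and the y axis, overlap each other, or, when the minimum circumscribed rectangles do not overlap, for objects that overlap when the two objects are projected onto the x axis or the y axis, calculates D1/D2 as the visual impression distance, where D1 is the distance between the mutually facing sides of the respective minimum circumscribed rectangles and D2 is the length of the overlapping portion when the facing sides are projected onto an axis parallel to them; determines whether to integrate the two objects according to a comparison between the value of the visual impression distance D1/D2 and a threshold value; and, when integrating, performs the processing of integrating the two objects in each of the x-axis direction and the y-axis direction, thereby integrating objects and generating a region.
- The program according to claim 32, wherein the text area generation processing or the chart area generation processing: for objects whose minimum circumscribed rectangles, formed of sides parallel to the x axis and the y axis, overlap each other, or, when the minimum circumscribed rectangles do not overlap, for objects that overlap when the two objects are projected onto the x axis or the y axis, lets D1 be the distance between the mutually facing sides of the respective minimum circumscribed rectangles, D2 the length of the overlapping portion when the facing sides are projected onto an axis parallel to them, D3 the sum of the lengths of the sides of the two objects perpendicular to the facing sides, and D4 the total length when the facing sides are projected onto an axis parallel to them; determines whether to integrate the two objects according to a comparison between the value of (D1×D4)/(D2×D3) and a threshold value; and, when integrating, performs the processing of integrating the two objects in each of the x-axis direction and the y-axis direction, thereby integrating objects and generating a region.
- The program according to any one of claims 32 to 34, wherein the text area generation processing or the chart area generation processing calculates the visual impression distance for all combinations of the minimum circumscribed rectangles of any two objects included in one slide and uses the average value as the threshold value.
- The program according to claims 27 to 35, which causes the information processing apparatus to execute: area information conversion processing that converts a query, entered by a user, about the layout of the regions of an electronic document or image document into area information; and similarity calculation processing that calculates a similarity by comparing the area information of electronic documents and image documents with the area information converted by the area information conversion processing, wherein a document having a layout similar to the layout of the document regions entered by the user is retrieved.
- The program according to claim 36, wherein the similarity calculation processing calculates the similarity by comparing, for each region type (text area and chart area), the barycentric coordinate values representing the position of the region, the area representing the size of the region, and the aspect ratio representing the shape of the region.
- The program according to claim 37, wherein the similarity calculation processing uses, in calculating the similarity, the cosine of the angle formed by the feature vectors of two regions, each consisting of the x-coordinate of the centroid, the y-coordinate of the centroid, the area, and the aspect ratio.
- The program according to any one of claims 36 to 38, which causes the information processing apparatus to execute keyword search processing that searches for documents that include an input keyword, wherein the similarity calculation processing calculates the similarity only for the documents found by the keyword search processing, and a document that includes the keyword entered by the user and has a layout similar to the layout of the document regions entered by the user is retrieved.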
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/809,256 US20110043869A1 (en) | 2007-12-21 | 2008-12-16 | Information processing system, its method and program |
JP2009547049A JPWO2009081791A1 (ja) | 2007-12-21 | 2008-12-16 | Information processing system, its method and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2007-329475 | 2007-12-21 | ||
JP2007329475 | 2007-12-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009081791A1 (ja) | 2009-07-02 |
Family
ID=40801096
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2008/072824 WO2009081791A1 (ja) | Information processing system, its method and program | 2007-12-21 | 2008-12-16 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20110043869A1 (ja) |
JP (1) | JPWO2009081791A1 (ja) |
WO (1) | WO2009081791A1 (ja) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101551859B (zh) * | 2008-03-31 | 2012-01-04 | 夏普株式会社 | Image discrimination device and image retrieval device |
JP4539756B2 (ja) * | 2008-04-14 | 2010-09-08 | 富士ゼロックス株式会社 | Image processing apparatus and image processing program |
US8218875B2 (en) * | 2010-06-12 | 2012-07-10 | Hussein Khalid Al-Omari | Method and system for preprocessing an image for optical character recognition |
US8825649B2 (en) * | 2010-07-21 | 2014-09-02 | Microsoft Corporation | Smart defaults for data visualizations |
US20120284276A1 (en) * | 2011-05-02 | 2012-11-08 | Barry Fernando | Access to Annotated Digital File Via a Network |
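KR101364178B1 (ko) * | 2011-06-08 | 2014-02-25 | 이해성 | Electronic book system, and apparatus and method for generating and retrieving electronic book data |
WO2012169841A2 (ko) * | 2011-06-08 | 2012-12-13 | 주식회사 내일이비즈 | Electronic book system, and apparatus and method for generating and retrieving electronic book data |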
US9507782B2 (en) | 2012-08-14 | 2016-11-29 | Empire Technology Development Llc | Dynamic content preview |
KR102124601B1 (ko) * | 2013-06-21 | 2020-06-19 | 삼성전자주식회사 | Electronic device and method for extracting the distance of a subject and displaying information |
US10331732B1 (en) * | 2016-12-16 | 2019-06-25 | National Technology & Engineering Solutions Of Sandia, Llc | Information searching system |
CN110651267B (zh) | 2017-09-13 | 2023-09-19 | 谷歌有限责任公司 | Efficiently augmenting images with related content |
CN108038426A (zh) * | 2017-11-29 | 2018-05-15 | 阿博茨德(北京)科技有限公司 | Method and device for extracting chart information from a file |
CN113282779A (zh) * | 2020-02-19 | 2021-08-20 | 阿里巴巴集团控股有限公司 | Image search method, apparatus, and device |
US11763586B2 (en) | 2021-08-09 | 2023-09-19 | Kyocera Document Solutions Inc. | Method and system for classifying document images |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
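JPS62165284A (ja) * | 1986-01-17 | 1987-07-21 | Hitachi Ltd | Character string extraction method |
JP2000306041A (ja) * | 1999-04-23 | 2000-11-02 | Ricoh Co Ltd | Character size estimation method and recording medium |
JP2004030696A (ja) * | 1997-12-19 | 2004-01-29 | Fujitsu Ltd | Character string extraction device and pattern extraction device |
JP2006155380A (ja) * | 2004-11-30 | 2006-06-15 | Canon Inc | Image processing apparatus, method thereof, and control method thereof |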
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5179599A (en) * | 1991-06-17 | 1993-01-12 | Hewlett-Packard Company | Dynamic thresholding system for documents using structural information of the documents |
US5588072A (en) * | 1993-12-22 | 1996-12-24 | Canon Kabushiki Kaisha | Method and apparatus for selecting blocks of image data from image data having both horizontally- and vertically-oriented blocks |
US6137905A (en) * | 1995-08-31 | 2000-10-24 | Canon Kabushiki Kaisha | System for discriminating document orientation |
JP3601658B2 (ja) * | 1997-12-19 | 2004-12-15 | 富士通株式会社 | Character string extraction device and pattern extraction device |
-
2008
- 2008-12-16 WO PCT/JP2008/072824 patent/WO2009081791A1/ja active Application Filing
- 2008-12-16 JP JP2009547049A patent/JPWO2009081791A1/ja active Pending
- 2008-12-16 US US12/809,256 patent/US20110043869A1/en not_active Abandoned
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012090264A (ja) * | 2010-10-19 | 2012-05-10 | Palo Alto Research Center Inc | Method for finding similar content in mixed collections of presentation and rich document content using two-dimensional visual fingerprints |
JP2013190973A (ja) * | 2012-03-13 | 2013-09-26 | Nec Corp | System and method for retrieving similar documents using figure information in documents |
US9378248B2 (en) | 2012-03-13 | 2016-06-28 | Nec Corporation | Retrieval apparatus, retrieval method, and computer-readable recording medium |
JP2019505038A (ja) * | 2015-12-31 | 2019-02-21 | 北京金山▲辧▼公▲軟▼件股▲ふん▼有限公司Beijing Kingsoft Office Software,Inc. | Method and apparatus for identifying slides |
US10698943B2 (en) | 2015-12-31 | 2020-06-30 | Beijing Kingsoft Office Software, Inc. | Method and apparatus for recognizing slide |
JP2020052961A (ja) * | 2018-09-28 | 2020-04-02 | キヤノン株式会社 | Content providing system, content providing method, information processing apparatus, and program |
JP7134814B2 (ja) | 2018-09-28 | 2022-09-12 | キヤノン株式会社 | System, page data output method, and program |
JP7435118B2 (ja) | 2020-03-24 | 2024-02-21 | 富士フイルムビジネスイノベーション株式会社 | Information processing apparatus and program |
JP2022026017A (ja) * | 2020-07-30 | 2022-02-10 | 楽天グループ株式会社 | Information processing apparatus, information processing method, and program |
Also Published As
Publication number | Publication date |
---|---|
JPWO2009081791A1 (ja) | 2011-05-06 |
US20110043869A1 (en) | 2011-02-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2009081791A1 (ja) | Information processing system, its method and program | |
US7840891B1 (en) | Method and system for content extraction from forms | |
US8442324B2 (en) | Method and system for displaying image based on text in image | |
US9430716B2 (en) | Image processing method and image processing system | |
US20140176564A1 (en) | Chinese Character Constructing Method and Device, Character Constructing Method and Device, and Font Library Building Method | |
US20150199567A1 (en) | Document classification assisting apparatus, method and program | |
US20190026355A1 (en) | Information processing device and information processing method | |
US20140184610A1 (en) | Shaping device and shaping method | |
KR101549792B1 (ko) | Apparatus and method for automatically creating documents | |
JP2018081674A (ja) | Line and word segmentation method for handwritten text images | |
US20170132484A1 (en) | Two Step Mathematical Expression Search | |
US11928418B2 (en) | Text style and emphasis suggestions | |
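CN111222314B (zh) | Method, apparatus, device, and storage medium for comparing fixed-layout documents | |
JP6736224B2 (ja) | Text analysis device and text analysis program | |
JP2006318219A (ja) | Similar slide search program and search method | |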
US11138257B2 (en) | Object search in digital images | |
Tomovic et al. | Aligning document layouts extracted with different OCR engines with clustering approach | |
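JP6441142B2 (ja) | Search device, method, and program | |
KR20210033730A (ko) | Electronic device for displaying paragraph dividing lines based on text line information in a PDF document, and operating method thereof | |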
Yamashita et al. | A document recognition system and its applications | |
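JP7410532B2 (ja) | Character determination device and character determination program | |
KR20200110880A (ko) | Electronic device for selecting important keywords for a document based on style attributes, and operating method thereof | |
WO2009087815A1 (ja) | Similar document search system, similar document search method, and recording medium | |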
US20240086455A1 (en) | Image search apparatus, image search method, and non-transitory storage medium | |
JPH09319764A (ja) | Keyword generation device and document search device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 08864665 Country of ref document: EP Kind code of ref document: A1 |
|
DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | ||
ENP | Entry into the national phase |
Ref document number: 2009547049 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 12809256 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 08864665 Country of ref document: EP Kind code of ref document: A1 |