SG192807A1 - Information processing device


Info

Publication number
SG192807A1
SG192807A1
Authority
SG
Singapore
Prior art keywords
recognition
recognition result
information
engines
integration
Application number
SG2013062302A
Inventor
Shinichiro Kamei
Nobuhisa Shiraishi
Takeshi Arikuma
Original Assignee
Nec Corp
Application filed by Nec Corp
Publication of SG192807A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/254Fusion techniques of classification results, e.g. of results related to same input data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/809Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103Static body considered as a whole, e.g. static pedestrian or occupant recognition

Abstract

An information processing device 200 of the present invention includes: a recognition result acquiring means 201 for acquiring respective recognition result information outputted by a plurality of recognition engines 211, 212 and 213 executing different recognition processes on recognition target data; and an integration recognition result outputting means 202 for outputting a new recognition result obtained by integrating the respective recognition result information acquired from the plurality of recognition engines. The recognition result acquiring means 201 is configured to acquire the respective recognition result information in a data format common to the plurality of recognition engines, from the plurality of recognition engines. The integration recognition result outputting means 202 is configured to integrate the respective recognition result information based on the respective recognition result information, and output as the new recognition result. Figure 14

Description

DESCRIPTION
TITLE: INFORMATION PROCESSING DEVICE
TECHNICAL FIELD
[0001]
The present invention relates to an information processing device, more specifically, relates to an information processing device that further outputs a result through the use of the results of recognition by a plurality of recognition engines.
BACKGROUND ART
[0002]
In accordance with development of information processing techniques, various recognition engines executing recognition processes have been developed. For example, there exist various recognition engines, such as a recognition engine that identifies a person from still image data, a recognition engine that generates location information tracing the flow of a person from moving image data, and a recognition engine that generates text data from speech data. As one example, Patent Document 1 discloses a robot that is equipped with recognizers for image recognition and speech recognition.
[0003]
Patent Document 1: Japanese Unexamined Patent Application Publication No. 2004-283959
[0004]
Although various recognition engines have been developed as described above, the respective recognition engines are incorporated in specific systems depending on individual applications, and it is therefore difficult to use these recognition engines for other purposes.
Even if it is intended to reuse the recognition engines, developing a new system capable of using them requires enormous cost because the output formats of the recognition engines differ from each other. Thus, there is a problem that it is impossible to flexibly respond to an application’s request to use output results acquired from a plurality of recognition engines, and that developing such a system requires enormous cost.
SUMMARY
[0005]
Accordingly, an object of the present invention is to solve the abovementioned problem that it is difficult to reuse output results acquired from a plurality of recognition engines at low cost.
[0006]
In order to achieve the abovementioned object, an information processing device of an exemplary embodiment of the present invention includes: a recognition result acquiring means for acquiring respective recognition result information outputted by a plurality of recognition engines executing different recognition processes on recognition target data; and an integration recognition result outputting means for outputting a new recognition result obtained by integrating the respective recognition result information acquired from the plurality of recognition engines, wherein:
the recognition result acquiring means is configured to acquire the respective recognition result information in a data format common to the plurality of recognition engines, from the plurality of recognition engines; and the integration recognition result outputting means is configured to integrate the respective recognition result information based on the respective recognition result information and output as the new recognition result.
[0007]
Further, a computer program of another exemplary embodiment of the present invention is a computer program including instructions for causing an information processing device to realize: a recognition result acquiring means for acquiring respective recognition result information outputted by a plurality of recognition engines executing different recognition processes on recognition target data; and an integration recognition result outputting means for outputting a new recognition result obtained by integrating the respective recognition result information acquired from the plurality of recognition engines, the computer program also including instructions for: causing the recognition result acquiring means to acquire the recognition result information in a data format common to the plurality of recognition engines, from each of the plurality of recognition engines; and causing the integration recognition result outputting means to integrate the respective recognition result information based on the respective recognition result information and output as the new recognition result.
[0008]
Further, an information processing method of another exemplary embodiment of the present invention includes: acquiring respective recognition result information outputted by a plurality of recognition engines executing different recognition processes on recognition target data, the respective recognition result information being in a data format common to the plurality of recognition engines; and integrating the respective recognition result information based on the respective recognition result information acquired from the plurality of recognition engines and outputting as a new recognition result.
[0009]
With the configurations as described above, the present invention can provide an information processing device capable of reusing output results acquired from a plurality of recognition engines at low cost.
BRIEF DESCRIPTION OF DRAWINGS
[0010]
Fig. 1 is a block diagram showing the configuration of an intermediate-layer device in the present invention;
Fig. 2 is a diagram showing an example of the structure of a recognition result registered in an intermediate-layer device in a first exemplary embodiment of the present invention;
Fig. 3 is a diagram showing an example of the structure of a recognition result registered in the intermediate-layer device in the first exemplary embodiment of the present invention;
Fig. 4 is a diagram showing an example of the structure of a recognition result registered in the intermediate-layer device in the first exemplary embodiment of the present invention;
Fig. 5 is a flowchart showing the operation of the intermediate-layer device in the first exemplary embodiment of the present invention;
Fig. 6 is a diagram showing an example of a recognition result acquired from a recognition engine by the intermediate-layer device in the first exemplary embodiment of the present invention;
Fig. 7 is a diagram showing an example of a recognition result acquired from the recognition engine by the intermediate-layer device in the first exemplary embodiment of the present invention;
Fig. 8 is a diagram showing a state in which the intermediate-layer device is integrating recognition results acquired from recognition engines in the first exemplary embodiment of the present invention;
Fig. 9 is a diagram showing a state in which the intermediate-layer device has integrated recognition results acquired from the recognition engines in the first exemplary embodiment of the present invention;
Fig. 10 is a diagram showing an example of the structure of a recognition result registered in an intermediate-layer device in the second exemplary embodiment of the present invention;
Fig. 11 is a diagram showing an example of the structure of a recognition result registered in the intermediate-layer device in the second exemplary embodiment of the present invention;
Fig. 12 is a diagram showing a state in which the intermediate-layer device is integrating recognition results acquired from recognition engines in the second exemplary embodiment of the present invention;
Fig. 13 is a diagram showing a state in which the intermediate-layer device has integrated recognition results acquired from the recognition engines in the second exemplary embodiment of the present invention; and
Fig. 14 is a block diagram showing the configuration of an information processing device in Supplementary Note 1 of the present invention.
EXEMPLARY EMBODIMENTS
[0011] <First Exemplary Embodiment>
A first exemplary embodiment of the present invention will be described with reference to Figs. 1 to 9. Figs. 1 to 4 are diagrams for describing the configuration of an intermediate-layer device in this exemplary embodiment, and Figs. 5 to 9 are diagrams for describing the operation thereof.
[0012] [Configuration]
As shown in Fig. 1, an information processing system in the first exemplary embodiment includes a plurality of recognition engines 21 to 23, an intermediate-layer device 10 connected with the recognition engines 21 to 23, and an application unit 30 implemented in an information processing terminal connected with the intermediate-layer device 10.
[0013]
The recognition engines 21 to 23 are information processing devices that execute different recognition processes, respectively, on recognition target data. The recognition engines 21 to 23 are, for example: a recognition engine that identifies a person from recognition target data such as still image data, moving image data and speech data; a recognition engine that generates location information tracing the flow of a person from moving image data; and a recognition engine that generates text data from speech data. The recognition engines in this exemplary embodiment are a person recognition engine (denoted by reference numeral 21 hereinafter) that identifies a previously set person from moving image data, and a flow recognition engine (denoted by reference numeral 22 hereinafter) that recognizes a flow representing the trajectory of an object from moving image data. The recognition engines are not limited to those executing the recognition processes described above, and the number thereof is not limited to two as described above or three as shown in Fig. 1. For example, the abovementioned person recognition engine may be an object recognition engine that identifies not only a person but also a previously set object.
[0014]
To be specific, the person recognition engine 21 detects previously registered feature value data of a “specific person” for which the person recognition engine 21 is requested to execute a recognition process, from among frame images at respective dates and times of moving image data. Upon detecting the feature value data, the person recognition engine 21 recognizes that the “specific person” was at the set-up position of the camera having acquired the moving image data at the detection date and time, and outputs person identification information (a person ID) set for the recognized “specific person” and location information showing the date/time and position at which the data has been detected, as recognition result information. The person recognition engine 21 in this exemplary embodiment outputs the recognition result information described above to the intermediate-layer device 10 in accordance with a data format previously registered in the intermediate-layer device 10; this data format will be described later.
[0015]
Further, to be specific, the flow recognition engine 22 detects a moving object in moving image data, and recognizes a time-series trajectory of the object. Then, the flow recognition engine 22 outputs object identification information (an object ID) identifying the moving object, trajectory identification information (a trajectory ID) identifying the detected whole trajectory, and flow information including the dates/times and positions (coordinates) at which the object has been detected, as recognition result information. The flow recognition engine 22 in this exemplary embodiment outputs the recognition result information described above to the intermediate-layer device 10 in accordance with a data format previously registered in the intermediate-layer device 10; this data format will be described later.
[0016]
Next, the application unit 30 will be described. The application unit 30 is implemented in an information processing terminal connected to the intermediate-layer device 10, and has a function of making a request to the intermediate-layer device 10 for the results of recognition by the recognition engines 21 to 23. In this exemplary embodiment, the application unit 30 specifically requests a new recognition result obtained by reusing and integrating the result of recognition by the person recognition engine 21 and the result of recognition by the flow recognition engine 22. To be specific, in the example described in this exemplary embodiment, the application unit 30 makes a request to the intermediate-layer device 10 for the “flow of a specific person.”
[0017]
Next, the intermediate-layer device 10 will be described. The intermediate-layer device 10 is an information processing device including an arithmetic device and a storage device. The intermediate-layer device 10 includes a recognition-engine output acquiring unit 11, an instance/class correspondence judging unit 12, an output instance holding unit 13, and a class structure registering unit 15, which are built by incorporation of programs into the arithmetic device, as shown in Fig. 1. The intermediate-layer device 10 also includes a class structure holding unit 14 and a class ID holding unit 16 in the storage device. The respective configurations will be described in detail below.
[0018]
First, the recognition-engine output acquiring unit 11 (a recognition result acquiring means) acquires recognition result information generated in recognition processes by the person recognition engine 21 and the flow recognition engine 22 from the respective engines as described above. Then, the instance/class correspondence judging unit 12 (the recognition result acquiring means) holds each of the acquired recognition result information in a previously defined class structure showing a data format.
[0019]
A class structure, which is a data format of recognition result information corresponding to the person recognition engine 21 and which is recognition result definition information defined by, for example, a graph structure or a tree structure, is stored into the class structure holding unit 14 (a definition information storing means) and the class ID holding unit 16 from the class structure registering unit 15, and thereby defined. To be specific, as shown in Fig. 2, a class structure C1 corresponding to the person recognition engine 21 is defined by a graph structure such that location information C12, represented by coordinates (an X coordinate value, a Y coordinate value) of a position where an image (a frame) with a specific person detected has been captured and by a date and time (detection time) when the image has been captured, is placed below person identification information (a person ID) C11 identifying the detected specific person. Thus, the instance/class correspondence judging unit 12 applies values included in recognition result information actually acquired from the person recognition engine 21 to the person identification information (person ID) and the position and so on (X coordinate value, Y coordinate value, detection time) of the class structure C1 corresponding to the person recognition engine 21, and holds them as shown in Fig. 6.
[0020]
Further, as shown in Fig. 3, a class structure C2, which is a data format of recognition result information corresponding to the flow recognition engine 22 and which is recognition result definition information defined by, for example, a graph structure or a tree structure, is defined by a graph structure such that trajectory identification information (a trajectory ID) C22 identifying a whole trajectory of a detected moving object is placed below object identification information (an object ID) C21 identifying the object, and flow information C23 represented by dates and times (detection time) forming the trajectory of the detected object and coordinates (X coordinate values, Y coordinate values) of positions at these times is placed below C22. Thus, the instance/class correspondence judging unit 12 applies values included in recognition result information actually acquired from the flow recognition engine 22 to the object identification information (object ID) and the position and so on (X coordinate value, Y coordinate value, detection time) of the class structure C2 corresponding to the flow recognition engine 22, and holds as shown in Fig. 7.
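As a concrete illustration of the tree-structured data formats described above (this rendering is not part of the patent text; the class and field names below are hypothetical), the class structures C1 and C2 could be expressed as follows, an "instance" such as those of Figs. 6 and 7 then being this shape filled with actual values:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class LocationInfo:            # C12: placed below the person ID in C1
    x: float                   # X coordinate value of the capture position
    y: float                   # Y coordinate value of the capture position
    detected_at: datetime      # date and time of detection

@dataclass
class PersonResult:            # class structure C1 (person recognition engine 21)
    person_id: str             # C11: identifies the detected specific person
    location: LocationInfo

@dataclass
class FlowPoint:               # one date/time + position entry of the flow C23
    x: float
    y: float
    detected_at: datetime

@dataclass
class FlowResult:              # class structure C2 (flow recognition engine 22)
    object_id: str             # C21: identifies the moving object
    trajectory_id: str         # C22: identifies the whole detected trajectory
    flow: list[FlowPoint]      # C23: placed below the trajectory ID

# Filling in actual values, in the spirit of Figs. 6 and 7 (values invented):
person = PersonResult("Person-A",
                      LocationInfo(10.0, 25.0, datetime(2011, 2, 17, 9, 30)))
track = FlowResult("Obj-1", "Trj-1",
                   [FlowPoint(10.0, 25.0, datetime(2011, 2, 17, 9, 30)),
                    FlowPoint(12.5, 27.0, datetime(2011, 2, 17, 9, 31))])
```

A graph-structured store (for example, RDF triples) would serve equally well; the nesting above merely mirrors the parent-child placement described for Figs. 2 and 3.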
[0021]
The output instance holding unit 13 (an integration recognition result outputting means) included in the intermediate-layer device 10 integrates the recognition result information acquired from the person recognition engine 21 and the recognition result information acquired from the flow recognition engine 22, which are applied to the previously defined class structures C1 and C2 and held as described above, and holds the result as a new recognition result. To be specific, the output instance holding unit 13 integrates the recognition result information in accordance with a class structure C3. The class structure C3 is previously stored in the class structure holding unit 14 (a definition information storing means) and the class ID holding unit 16, and is integration definition information showing, by a graph structure or a tree structure, an integration format that is a data format representing a condition of integration of the recognition result information and a new recognition result after integration.
[0022]
As shown in Fig. 4, the class structure C3, which represents a data format of a condition of integration of the recognition result information and of a new recognition result after integration, is defined such that the trajectory identification information (trajectory ID) C22 and the flow information C23 are related and integrated with the person identification information (person ID) C11, which is part of the recognition result C1 acquired from the person recognition engine 21 and identifies a detected specific person.
[0023]
In this case, a condition of integration of the recognition result information as described above is that the person identification information (person ID) C11 outputted from the person recognition engine 21 coincides with part of the object identification information (object ID) C21 outputted from the flow recognition engine 22. For this judgment, the output instance holding unit 13 examines whether the recognition result information detected by the respective engines 21 and 22 includes information judged to correspond to each other in accordance with a predetermined standard. To be specific, the output instance holding unit 13 examines whether the flow information C23 outputted by the flow recognition engine 22 includes information that coincides with the location information C12, namely, the date/time and position of detection of a specific person outputted by the person recognition engine 21. In a case that the location information C12 and part of the flow information C23 coincide with each other, the output instance holding unit 13 relates the coincident flow information C23 and the trajectory identification information C22 above C23 to the person identification information (person ID) C11 corresponding to the coincident location information C12, and holds a new recognition result as shown in Fig. 9.
[0024]
However, the condition of integration of the recognition result information is not limited to the case where the recognition result information detected by the respective engines 21 and 22 includes exactly coincident information as described above. For example, in a case that the flow information C23 outputted by the flow recognition engine 22 includes information “close to” the location information C12, namely, the date/time and position of detection of a specific person outputted by the person recognition engine 21, the output instance holding unit 13 may judge that the flow information C23 and the location information C12 correspond to each other, and integrate them by relating the flow information C23 and the trajectory identification information C22 above C23 to the person identification information C11 corresponding to the location information C12. In this case, “close to” the date/time and position of detection of a specific person shall include, for example, a date/time and position within an allowance previously set for the detection date/time and position.
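A minimal sketch of this integration step, reusing the hypothetical PersonResult/FlowResult types from the sketch above; the thresholds and function names are likewise illustrative assumptions, not values taken from the patent:

```python
from datetime import timedelta
from typing import Optional

# Allowances for the "close to" judgment of [0024]; both values are assumed.
MAX_DISTANCE = 1.0                    # positional allowance (same units as X/Y)
MAX_TIME_GAP = timedelta(seconds=5)   # temporal allowance

def corresponds(point: FlowPoint, loc: LocationInfo) -> bool:
    """Predetermined standard: the flow point lies within the preset
    allowance of the person's detection position and date/time."""
    dist = ((point.x - loc.x) ** 2 + (point.y - loc.y) ** 2) ** 0.5
    close_in_time = abs(point.detected_at - loc.detected_at) <= MAX_TIME_GAP
    return dist <= MAX_DISTANCE and close_in_time

def integrate(person: PersonResult, track: FlowResult) -> Optional[dict]:
    """Build the new recognition result (class structure C3): relate the
    trajectory ID and flow information to the person ID, as in Fig. 9."""
    if any(corresponds(p, person.location) for p in track.flow):
        return {"person_id": person.person_id,
                "trajectory_id": track.trajectory_id,
                "flow": track.flow}
    return None   # integration condition not satisfied
```

Setting MAX_DISTANCE and MAX_TIME_GAP to zero recovers the exact-coincidence condition of [0023]; nonzero values realize the "within an allowance" variant.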
[0025]
The data format of the recognition result information outputted by the respective recognition engines 21 and 22 and the format of integration of the recognition result information by the output instance holding unit 13 are defined by a graph structure or a tree structure in the above case shown as an example, but may be defined by another data format.
[0026]
Further, the output instance holding unit 13 outputs a new recognition result generated and held as described above to the application unit 30.
[0027] [Operation]
Next, the operation of the information processing system described above, specifically, the operation of the intermediate-layer device 10 will be described with reference to a flowchart of Fig. 5 and data transition diagrams of Figs. 6 to 9.
[0028]
First, the intermediate-layer device 10 accepts a request for a recognition result from the application unit 30 (step S1). Herein, it is assumed that the intermediate-layer device 10 accepts a request for a recognition result of the “flow of a specific person.”
[0029]
Subsequently, the intermediate-layer device 10 acquires recognition result information from the respective recognition engines used for outputting the recognition result requested by the application unit 30. In this exemplary embodiment, the intermediate-layer device 10 acquires the recognition result information that the person recognition engine 21 and the flow recognition engine 22 have generated in recognition processes, from the respective engines (step S2). Then, the intermediate-layer device 10 holds the respective recognition result information having been acquired, in accordance with the previously defined class structures. To be specific, the intermediate-layer device 10 applies the recognition result information acquired from the person recognition engine 21 to the class structure C1 corresponding to the person recognition engine shown in Fig. 2, and holds it as shown in Fig. 6. Moreover, the intermediate-layer device 10 applies the recognition result information acquired from the flow recognition engine 22 to the class structure C2 corresponding to the flow recognition engine shown in Fig. 3, and holds it as shown in Fig. 7.
[0030]
Subsequently, the intermediate-layer device 10 examines the identity of the person identification information (person ID) C11’ and the object identification information (object ID) C21’ included in the respective recognition result information acquired from the person recognition engine 21 and the flow recognition engine 22 and applied and held in the respective class structures C1’ and C2’ as shown in Figs. 6 and 7 (step S3, see arrow Y2 in Fig. 8). To be specific, the intermediate-layer device 10 examines whether the flow information C23’ outputted from the flow recognition engine 22 includes information coincident with the location information C12’ outputted from the person recognition engine 21.
[0031]
In a case that the flow information C23’ includes information coincident with the location information C12’, the intermediate-layer device 10 judges that the person identification information (person ID) C11’ and the object identification information (object ID) C21’ that include the coincident information are identical to each other. Then, as shown in Fig. 9, the intermediate-layer device 10 relates and integrates the person identification information (person ID) C11’, the trajectory identification information C22’ placed below the object identification information (object ID) C21’ judged identical to C11’, and the flow information C23’ placed below C22’, and holds them as a new recognition result (step S4).
[0032]
After that, the intermediate-layer device 10 outputs the new recognition result generated and held as described above to the application unit 30 (step S5).
[0033]
In the above description, the recognition results acquired from the engines 21 and 22 are integrated when the location information C12’ outputted from the person recognition engine 21 coincides with part of the flow information C23’ outputted from the flow recognition engine 22, but the condition of integration is not limited to such a condition. For example, the recognition result information may be integrated as described above when the person identification information (person ID) C11’ included in the recognition result information acquired from the person recognition engine 21 coincides with the object identification information (object ID) C21’ included in the recognition result information acquired from the flow recognition engine 22.
Alternatively, the recognition result information may be integrated when the flow information C23’ outputted from the flow recognition engine 22 includes information “close to” the location information C12’ outputted from the person recognition engine 21.
[0034]
In the above description, the process is executed in the following order: accept a request for a recognition result from the application unit 30 (step S1); acquire recognition results from the respective recognition engines (step S2); and synthesize the recognition results (step S4).
However, execution of the process is not limited by such a procedure. For example, the process may be executed in the following order: previously execute steps S2, S3 and S4 shown in Fig. 5, that is, previously acquire recognition results from the respective recognition engines and synthesize the recognition results, and accumulate the results of synthesis in the intermediate-layer device 10; and when accepting a request for a recognition result from the application unit 30, extract a synthesis result corresponding to the request from among the accumulated data and output to the application unit 30.
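A rough sketch of this alternative, precompute-then-serve ordering, reusing the hypothetical integrate() helper above (the class and method names are assumptions, not from the patent):

```python
class IntermediateLayerStore:
    """Variant of [0034]: steps S2-S4 run in advance, and step S5 answers
    application requests from the accumulated synthesis results."""

    def __init__(self) -> None:
        self._results: list[dict] = []   # accumulated synthesis results

    def ingest(self, person: PersonResult, track: FlowResult) -> None:
        # Steps S2-S4, executed ahead of any application request.
        result = integrate(person, track)
        if result is not None:
            self._results.append(result)

    def query(self, person_id: str) -> list[dict]:
        # Steps S1/S5: extract the synthesis results matching the request.
        return [r for r in self._results if r["person_id"] == person_id]
```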
[0035]
Thus, providing the intermediate-layer device 10 in this exemplary embodiment to hold recognition result information in a data format common to a plurality of recognition engines facilitates reuse of recognition results acquired from the respective recognition engines for other purposes. Therefore, it becomes easy to develop a recognition engine and an application that requests a recognition result, and it is possible to increase the versatility of a recognition engine at low cost.
[0036] <Second Exemplary Embodiment>
Next, a second exemplary embodiment of the present invention will be described with reference to Figs. 1 and 5 and Figs. 10 to 13. Fig. 1 is a diagram for describing the configuration of an intermediate-layer device in this exemplary embodiment. Fig. 5 and Figs. 10 to 13 are diagrams for describing the operation thereof.
[0037] [Configuration]
An information processing system in the second exemplary embodiment includes, as in the first exemplary embodiment, the plurality of recognition engines 21 to 23, the intermediate-layer device 10 connected with the recognition engines 21 to 23, and the application unit 30 implemented in the information processing terminal connected with the intermediate-layer device 10 as shown in Fig. 1.
[0038]
The recognition engines in this exemplary embodiment are all person recognition engines that identify a person from recognition target data, but their recognition methods differ from each other. For example, a first person recognition engine (denoted by reference numeral 21 hereinafter) recognizes a specific person from moving image data based on previously set feature data of persons’ faces. A second person recognition engine (denoted by reference numeral 22 hereinafter) recognizes a specific person from speech data based on previously set feature data of persons’ voices. A third person recognition engine (denoted by reference numeral 23 hereinafter) recognizes a specific person from speech data based on previously set content of utterance (a keyword). However, the recognition engines are not limited to those executing these processes, and the number thereof is not limited to the abovementioned number. For example, each person recognition engine may be an object recognition engine that identifies not only a person but also a previously set object.
[0039]
The respective person recognition engines 21, 22 and 23 in this exemplary embodiment output recognition result information to the intermediate-layer device 10 in a common data format that is previously registered in the intermediate-layer device 10. To be specific, each of the person recognition engines 21, 22 and 23 outputs, as recognition result information, person candidate information including information identifying one or more candidate persons based on the results of recognition by the respective recognition methods.
[0040]
Correspondingly, the intermediate-layer device 10 stores a class structure C101, which defines a data format of the respective recognition result information corresponding to the person recognition engines 21, 22 and 23 and which is recognition result definition information of, for example, a graph structure or a tree structure, in the class structure holding unit 14 and so on, as shown in Fig. 10. To be specific, the class structure C101 corresponding to the respective person recognition engines 21, 22 and 23 is defined by a graph structure in which person identification information C111, C112 and C113 identifying one or more candidate persons is placed below engine identification information C110 identifying the person recognition engine having executed a recognition process, as shown in Fig. 10. Herein, the class structure C101 is defined so that the person identification information of the three persons most likely to be candidates as a result of the recognition process is placed and stored in order from the top. Consequently, the instance/class correspondence judging unit 12 of the intermediate-layer device 10 applies the recognition result information that the recognition-engine output acquiring unit 11 has acquired from the person recognition engines 21, 22 and 23 to the class structure C101 for each of the person recognition engines 21, 22 and 23, and holds it, as shown by reference numerals C101-1, C101-2 and C101-3 in Fig. 12.
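As an illustrative rendering (the field names and all person IDs other than “Person-A” are invented), class structure C101 and the three held instances of Fig. 12 might look like the following:

```python
from dataclasses import dataclass

@dataclass
class CandidateResult:        # class structure C101
    engine_id: str            # C110: the engine that executed the recognition
    candidates: list[str]     # C111-C113: top-three person IDs, best first

# Instances in the spirit of C101-1 to C101-3 in Fig. 12 (values invented,
# except that "Person-A" appears in all three lists as in the example):
c101_1 = CandidateResult("engine-21-face",    ["Person-A", "Person-B", "Person-C"])
c101_2 = CandidateResult("engine-22-voice",   ["Person-D", "Person-A", "Person-E"])
c101_3 = CandidateResult("engine-23-keyword", ["Person-A", "Person-F", "Person-B"])
```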
[0041]
Then, the output instance holding unit 13 included in the intermediate-layer device 10 integrates the recognition result information having been acquired from the respective person recognition engines 21, 22 and 23, applied to the previously defined class structure C101 and held as described above, and holds it as a new recognition result. To be specific, the output instance holding unit 13 integrates the recognition result information in accordance with a class structure C102, which is previously defined in the class structure holding unit 14 and so on, and which is integration definition information defining, in a graph structure or a tree structure, an integration format that is a data format representing a condition of integration of the recognition result information and representing a new recognition result after integration.
[0042]
As shown in Fig. 11, the class structure C102, which represents a data format of a condition of integration of the recognition result information and of a new recognition result after integration, represents a structure of integration such that one piece of person identification information (person ID) C121 is identified from among the person identification information of the candidate persons in the recognition result information C120 acquired from the respective person recognition engines 21, 22 and 23.
[0043]
In this case, the condition of integration of the recognition result information as described above is that there exists coincident person identification information among the person identification information (person IDs) outputted from the different person recognition engines 21, 22 and 23. For this judgment, the output instance holding unit 13 examines whether the respective recognition result information outputted by the engines 21, 22 and 23 includes person identification information judged to correspond to each other in accordance with a predetermined standard. To be specific, the output instance holding unit 13 examines whether all of the three recognition result information outputted from the three person recognition engines 21, 22 and 23 include identical person identification information, identifies the identical person identification information (person ID), and holds this person identification information as a new identification result.
[0044]
However, the condition of integration of the recognition result information is not limited to the case where the recognition result information detected by the respective engines 21, 22 and 23 includes information coincident with each other. For example, the similarity of attribute information, which is feature value data extracted from the data of the persons corresponding to the respective detected person identification information (person IDs), may first be judged in accordance with a predetermined standard. Then, in a case that the degree of the similarity is within a preset range in accordance with the standard, the persons of the respective person identification information are judged to be identical.
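A minimal sketch of this integration, reusing the hypothetical CandidateResult instances from the sketch above (the relaxation to "half or more" of the engines anticipates the variant mentioned in [0055]; the attribute-similarity fallback of [0044] is omitted):

```python
from typing import Optional

def integrate_candidates(results: list[CandidateResult],
                         min_fraction: float = 1.0) -> Optional[str]:
    """Identify the single person ID (C121) included in the candidate lists
    of at least min_fraction of the engines; None if no unique ID qualifies."""
    counts: dict[str, int] = {}
    for result in results:
        for person_id in set(result.candidates):
            counts[person_id] = counts.get(person_id, 0) + 1
    needed = min_fraction * len(results)
    agreed = [pid for pid, n in counts.items() if n >= needed]
    return agreed[0] if len(agreed) == 1 else None

# "Person-A" is common to all three candidate lists, so it becomes the
# new recognition result C121' of Fig. 13:
assert integrate_candidates([c101_1, c101_2, c101_3]) == "Person-A"
```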
[0045]
The data format of the recognition result information acquired from the respective recognition engines 21, 22 and 23 and the format of integration of the recognition result information by the output instance holding unit 13 are defined by a graph structure or a tree structure in the above case described as an example, but may be defined by another data format.
[0046]
Further, the output instance holding unit 13 outputs a new recognition result generated and held as described above to the application unit 30.
[0047] [Operation]
Next, the operation of the information processing system described above, specifically, the operation of the intermediate-layer device 10 will be described with reference to the flowchart of Fig. 5 and the data transition diagrams of Figs. 10 to 13.
[0048]
First, the intermediate-layer device 10 accepts a request for a recognition result from the application unit 30 (step S1). For example, it is assumed that the intermediate-layer device 10 accepts a request for a recognition result of “identification of a person.”
[0049]
Subsequently, the intermediate-layer device 10 acquires recognition result information from the respective recognition engines used for outputting the recognition result requested by the application unit 30 (step S2). Then, the intermediate-layer device 10 holds the respective recognition result information having been acquired in accordance with a previously defined class structure.
[0050]
To be specific, in this exemplary embodiment, three person identification information as candidates acquired from the first person recognition engine 21 recognizing a person based on feature data of the face of a person from moving image data are applied to the class structure and held as shown by reference numeral C101-1 in Fig. 12. Moreover, three person identification information as candidates acquired from the second person recognition engine 22 recognizing a person based on feature data of the voice of a person from speech data are applied to the class structure and held as shown by reference numeral C101-2 in Fig. 12. Furthermore, three person identification information as candidates acquired from the third person recognition engine 23 recognizing a person based on the content of utterance (keyword) from speech data are applied to the class structure and held as shown by reference numeral C101-3 in Fig. 12.
[0051]
Subsequently, the intermediate-layer device 10 examines person identification information corresponding to each other, herein, person identification information common to each other, among the person identification information (person IDs) C111-1 to C113-1, C111-2 to C113-2 and C111-3 to C113-3 that are candidates acquired from the respective person recognition engines 21, 22 and 23 and that are applied and held in the defined class structure as shown in Fig. 12 (step S3). In the example shown in Fig. 12, because the person identification information of “Person-A” is common to all the recognition result information as shown by reference numerals C111-1, C112-2 and C111-3, the intermediate-layer device 10 integrates the recognition results into one piece of person identification information C121’ represented by “Person-A” as shown in Fig. 13 (step S4). Alternatively, in a case that, as a result of judgment of the similarity of attribute information attached to the respective person identification information, the degree of the similarity is higher than a standard, the intermediate-layer device 10 may judge that the persons in the respective person identification information are identical, and integrate the recognition results.
[0052]
In the above description, the process is executed in the following order: accept a request for a recognition result from the application unit 30 (step S1); acquire recognition results from the respective recognition engines (step S2); and synthesize the recognition results (step S4).
However, execution of the process is not limited by such a procedure. For example, the intermediate-layer device 10 may execute the process in the following order: previously acquire recognition results from the respective recognition engines to synthesize the recognition results, and accumulate the results of synthesis into the intermediate-layer device 10; and then, upon acceptance of a request for a recognition result from the application unit 30, extract a synthesis result corresponding to the request from the accumulated data and output to the application unit 30.
[0053]
After that, the intermediate-layer device 10 outputs a new recognition result generated and held as described above to the application unit 30 (step S5).
[0054]
Thus, providing the intermediate-layer device 10 in this exemplary embodiment and holding recognition result information in a data format common to a plurality of recognition engines facilitates reuse of recognition results acquired from the respective recognition engines for other purposes. Therefore, it becomes easy to develop a recognition engine and an application that requests a recognition result, and it is possible to increase the versatility of a recognition engine at low cost.
[0055]
In the above description, the intermediate-layer device 10 integrates recognition results so that a person commonly included in all the recognition results is identified from among the candidate persons outputted from the person recognition engines 21, 22 and 23, but the condition of integration is not limited thereto. The intermediate-layer device 10 may integrate recognition results by another method, for example, by integrating them so as to identify a person included in the recognition results of half or more of the recognition engines. Moreover, in the example described above, each of the person recognition engines 21, 22 and 23 outputs three candidate persons as a recognition result, but may output only one candidate person, or another number of candidate persons, as a recognition result.
[0056] <Supplementary Notes>
The whole or part of the exemplary embodiments disclosed above can be described as the following supplementary notes. Below, the outline of the configuration of the information processing device according to the present invention will be described with reference to Fig. 14.
However, the present invention is not limited to the following configurations.
[0057] (Supplementary Note 1)
An information processing device 200 comprising: a recognition result acquiring means 201 for acquiring respective recognition result information outputted by a plurality of recognition engines 211, 212 and 213 executing different recognition processes on recognition target data; and an integration recognition result outputting means 202 for outputting a new recognition result obtained by integrating the respective recognition result information acquired from the plurality of recognition engines, wherein: the recognition result acquiring means 201 is configured to acquire the respective recognition result information in a data format common to the plurality of recognition engines, from the plurality of recognition engines; and the integration recognition result outputting means 202 is configured to integrate the respective recognition result information based on the respective recognition result information,
and output as the new recognition result.
[0058] (Supplementary Note 2)
The information processing device according to Supplementary Note 1, wherein the integration recognition result outputting means is configured to, when the respective recognition result information include information that correspond to each other in accordance with a predetermined standard, integrate the respective recognition result information and output as the new recognition result.
[0059] (Supplementary Note 3)
The information processing device according to Supplementary Note 1 or 2, comprising a definition information storing means for storing: recognition result definition information that defines a data format of the recognition result information acquired by the recognition result acquiring means from each of the plurality of recognition engines; and integration definition information that defines an integration format in which the integration recognition result outputting means integrates the respective recognition result information, wherein: the recognition result acquiring means is configured to acquire the respective recognition result information in the data format defined by the recognition result definition information; and the integration recognition result outputting means is configured to integrate the respective recognition result information in the integration format defined by the integration definition information, and output the new recognition result.
[0060] (Supplementary Note 4)
The information processing device according to any of Supplementary Notes 1 to 3, wherein the recognition result information acquired by the recognition result acquiring means from each of the plurality of recognition engines is in a graph-structured or tree-structured data format.
[0061] (Supplementary Note 5)
The information processing device according to Supplementary Note 2, wherein: the recognition result acquiring means is configured to acquire the recognition result information from an object recognition engine that recognizes an object from the recognition target data and a flow recognition engine that recognizes a flow representing a trajectory of a predetermined object from the recognition target data, respectively; and the integration recognition result outputting means is configured to integrate by relating flow information of a predetermined object with object information and output as the new recognition result, the flow information being the recognition result information acquired from the flow recognition engine, and the object information being the recognition result information acquired from the object recognition engine.
[0062] (Supplementary Note 6)
The information processing device according to Supplementary Note 5, wherein: the recognition result acquiring means is configured to: acquire the recognition result information including location information representing a position of an object recognized from the recognition target data by the object recognition engine and also representing a date and time when the object is recognized, from the object recognition engine; and also acquire flow information of a predetermined object recognized from the recognition target data by the flow recognition engine, from the flow recognition engine; and the integration recognition result acquiring means is configured to, when the flow information acquired from the flow recognition engine includes information that corresponds to the location information acquired from the object recognition engine in accordance with a predetermined standard, integrate by relating the flow information that is the recognition result acquired from the flow recognition engine with object information identified by the recognition result information that is acquired from the object recognition engine and that includes the information corresponding to the location information, and output as the new recognition result.
[0063] (Supplementary Note 7)
The information processing device according to Supplementary Note 2, wherein: the recognition result acquiring means is configured to acquire the recognition result information from a plurality of object recognition engines recognizing an object from the recognition target data in different recognition processes, respectively; and the integration recognition result outputting means is configured to integrate by identifying one object information based on respective object information that are the respective recognition result information acquired from the plurality of object recognition engines, and output the new recognition result.
[0064] (Supplementary Note 8)
The information processing device according to Supplementary Note 7, wherein: the recognition result acquiring means is configured to acquire the recognition result information including object candidate information representing a candidate of an object recognized from the recognition target data by each of the plurality of object recognition engines, from each of the plurality of object recognition engines; and the integration recognition result acquiring means is configured to identify object information that corresponds to each other in accordance with a predetermined standard in the respective object candidate information acquired from the plurality of object recognition engines, and output as the new recognition result.
[0065] (Supplementary Note 9)
A computer program comprising instructions for causing an information processing device to realize: a recognition result acquiring means for acquiring respective recognition result information outputted by a plurality of recognition engines executing different recognition processes on recognition target data; and an integration recognition result outputting means for outputting a new recognition result obtained by integrating the respective recognition result information acquired from the plurality of recognition engines, the computer program also comprising instructions for:
causing the recognition result acquiring means to acquire the respective recognition result information in a data format common to the plurality of recognition engines, from the plurality of recognition engines; and causing the integration recognition result outputting means to integrate the respective recognition result information based on the respective recognition result information and output as the new recognition result.
[0066] (Supplementary Note 10)
The computer program according to Supplementary Note 9, comprising instructions for causing the integration recognition result outputting means to, when the respective recognition result information include information that correspond to each other in accordance with a predetermined standard, integrate the respective recognition result information and output as the new recognition result.
[0067] (Supplementary Note 11)
An information processing method comprising: acquiring respective recognition result information outputted by a plurality of recognition engines executing different recognition processes on recognition target data, the respective recognition result information being in a data format common to the plurality of recognition engines; and integrating the respective recognition result information based on the respective recognition result information acquired from the plurality of recognition engines, and outputting as a new recognition result.
[0068] (Supplementary Note 12)
The information processing method according to Supplementary Note 11, comprising: integrating the respective recognition result information and outputting as the new recognition result, when the respective recognition result information include information corresponding to each other in accordance with a predetermined standard.
[0069] (Supplementary Note 13)
An information processing device comprising: a recognition result acquiring means for acquiring respective recognition result information outputted by a plurality of recognition engines executing different recognition processes on recognition target data; and an integration recognition result outputting means for outputting a new recognition result obtained by integrating the respective recognition result information acquired from the plurality of recognition engines, wherein: the recognition result acquiring means is configured to: acquire the recognition result information including location information that represents a position of an object recognized from the recognition target data by an object recognition engine recognizing an object from the recognition target data and also represents a date and time when the object is recognized, the recognition result information being in a data format common to the plurality of recognition engines, from the object recognition engine; and also acquire the recognition result information including flow information of a predetermined object recognized from the recognition target data by a flow recognition engine recognizing a flow representing a trajectory of a predetermined object from the recognition target data, the recognition result information being in a data format common to the plurality of recognition engines, from the flow recognition engine; and the integration recognition result acquiring means is configured to, when the flow information acquired from the flow recognition engine includes information that corresponds to the location information acquired from the object recognition engine in accordance with a predetermined standard, integrate by relating the flow information that is the recognition result acquired from the flow recognition engine with object information identified by the recognition result information that is acquired from the object recognition engine and that includes the information corresponding to the location information, and output as the new recognition result.
[0070] (Supplementary Note 14)
An information processing device comprising: a recognition result acquiring means for acquiring respective recognition result information outputted by a plurality of recognition engines executing different recognition processes on recognition target data; and an integration recognition result outputting means for outputting a new recognition result obtained by integrating the respective recognition result information acquired from the plurality of recognition engines, wherein: the recognition result acquiring means is configured to acquire the recognition result information including object candidate information representing a candidate of an object recognized from the recognition target data by each of a plurality of object recognition engines recognizing an object by different recognition processes from the recognition target data, the recognition result information being in a data format common to the plurality of recognition engines, from each of the plurality of object recognition engines; and the integration recognition result outputting means is configured to integrate by identifying one object information that corresponds to each other in accordance with a predetermined standard in the respective object candidate information acquired from the plurality of object recognition engines, and output as the new recognition result.
[0071]
In each of the exemplary embodiments, the computer program is stored in the storage device or recorded in a computer-readable recording medium. For example, the recording medium is a portable medium such as a flexible disk, an optical disk, a magneto-optical disk and a semiconductor memory.
[0072]
The present invention has been described above with reference to the exemplary embodiments, but the present invention is not limited to the exemplary embodiments described above. The configurations and details of the present invention can be modified in various manners that can be understood by those skilled in the art within the scope of the present invention.
[0073]
The present invention is based upon and claims the benefit of priority from Japanese patent application No. 2011-031613, filed on February 17, 2011, the disclosure of which is incorporated herein in its entirety by reference.
DESCRIPTION OF REFERENCE NUMERALS
[0074]
10 intermediate-layer device
11 recognition-engine output acquiring unit
12 instance/class correspondence judging unit
13 output instance holding unit
14 class structure holding unit
15 class structure registering unit
16 class ID holding unit
21, 22, 23 recognition engine
30 application unit
200 information processing device
201 recognition result acquiring unit
202 integration recognition result outputting means
211, 212, 213 recognition engine

Claims (12)

  1. An information processing device comprising: a recognition result acquiring means for acquiring respective recognition result information outputted by a plurality of recognition engines executing different recognition processes on recognition target data; and an integration recognition result outputting means for outputting a new recognition result obtained by integrating the respective recognition result information acquired from the plurality of recognition engines, wherein: the recognition result acquiring means is configured to acquire the respective recognition result information in a data format common to the plurality of recognition engines, from the plurality of recognition engines; and the integration recognition result outputting means is configured to integrate the respective recognition result information based on the respective recognition result information, and output as the new recognition result.
  2. The information processing device according to Claim 1, wherein the integration recognition result outputting means is configured to, when the respective recognition result information include information that correspond to each other in accordance with a predetermined standard, integrate the respective recognition result information and output as the new recognition result.
  3. The information processing device according to Claim 1 or 2, comprising a definition information storing means for storing: recognition result definition information that defines a data format of the recognition result information acquired by the recognition result acquiring means from each of the plurality of recognition engines; and integration definition information that defines an integration format in which the integration recognition result outputting means integrates the respective recognition result information, wherein: the recognition result acquiring means is configured to acquire the respective recognition result information in the data format defined by the recognition result definition information; and the integration recognition result outputting means is configured to integrate the respective recognition result information in the integration format defined by the integration definition information, and output the new recognition result.
  4. The information processing device according to any of Claims 1 to 3, wherein the recognition result information acquired by the recognition result acquiring means from each of the plurality of recognition engines is in a graph-structured or tree-structured data format.
  5. The information processing device according to Claim 2, wherein: the recognition result acquiring means is configured to acquire the recognition result information from an object recognition engine that recognizes an object from the recognition target data and a flow recognition engine that recognizes a flow representing a trajectory of a predetermined object from the recognition target data, respectively; and the integration recognition result outputting means is configured to integrate by relating flow information of a predetermined object with object information and output as the new recognition result, the flow information being the recognition result information acquired from the flow recognition engine, and the object information being the recognition result information acquired from the object recognition engine.
  6. The information processing device according to Claim 5, wherein: the recognition result acquiring means is configured to: acquire the recognition result information including location information representing a position of an object recognized from the recognition target data by the object recognition engine and also representing a date and time when the object is recognized, from the object recognition engine; and also acquire flow information of a predetermined object recognized from the recognition target data by the flow recognition engine, from the flow recognition engine; and the integration recognition result outputting means is configured to, when the flow information acquired from the flow recognition engine includes information that corresponds to the location information acquired from the object recognition engine in accordance with a predetermined standard, integrate by relating the flow information that is the recognition result acquired from the flow recognition engine with object information identified by the recognition result information that is acquired from the object recognition engine and that includes the information corresponding to the location information, and output as the new recognition result. (See the sketch following the claims for an illustration of this matching.)
  7. The information processing device according to Claim 2, wherein: the recognition result acquiring means is configured to acquire the recognition result information from a plurality of object recognition engines recognizing an object from the recognition target data in different recognition processes, respectively; and the integration recognition result outputting means is configured to integrate by identifying one object information based on respective object information that are the respective recognition result information acquired from the plurality of object recognition engines, and output the new recognition result.
  8. The information processing device according to Claim 7, wherein: the recognition result acquiring means is configured to acquire the recognition result information including object candidate information representing a candidate of an object recognized from the recognition target data by each of the plurality of object recognition engines, from each of the plurality of object recognition engines; and the integration recognition result outputting means is configured to identify object information that corresponds to each other in accordance with a predetermined standard in the respective object candidate information acquired from the plurality of object recognition engines, and output as the new recognition result.
  9. A computer program comprising instructions for causing an information processing device to realize: a recognition result acquiring means for acquiring respective recognition result information outputted by a plurality of recognition engines executing different recognition processes on recognition target data; and an integration recognition result outputting means for outputting a new recognition result obtained by integrating the respective recognition result information acquired from the plurality of recognition engines, the computer program also comprising instructions for: causing the recognition result acquiring means to acquire the respective recognition result information in a data format common to the plurality of recognition engines, from the plurality of recognition engines; and causing the integration recognition result outputting means to integrate the respective recognition result information based on the respective recognition result information and output as the new recognition result.
  10. The computer program according to Claim 9, comprising instructions for causing the integration recognition result outputting means to, when the respective recognition result information include information that correspond to each other in accordance with a predetermined standard, integrate the respective recognition result information and output as the new recognition result.
  11. An information processing method comprising: acquiring respective recognition result information outputted by a plurality of recognition engines executing different recognition processes on recognition target data, the respective recognition result information being in a data format common to the plurality of recognition engines; and integrating the respective recognition result information based on the respective recognition result information acquired from the plurality of recognition engines, and outputting as a new recognition result.
  12. The information processing method according to Claim 11, comprising: integrating the respective recognition result information and outputting as the new recognition result, when the respective recognition result information include information corresponding to each other in accordance with a predetermined standard.
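As a concrete reading of Claims 5 and 6, the following minimal Python sketch (all identifiers, data values, and thresholds are assumptions made for illustration, not taken from the patent) relates flow information from a flow recognition engine with object information from an object recognition engine when a trajectory point corresponds to the object's location information, that is, its position together with its date and time, under a simple predetermined standard:

from datetime import datetime, timedelta

# Hypothetical recognition results in a common, tree-structured format
# (plain nested dictionaries here, standing in for Claims 1 and 4).
object_result = {
    "object_id": "obj-17",
    "object_class": "person",
    "location": {
        "position": (120, 85),
        "timestamp": datetime(2012, 1, 19, 10, 30, 0),
    },
}

flow_result = {
    "flow_id": "flow-3",
    "trajectory": [
        {"position": (118, 83), "timestamp": datetime(2012, 1, 19, 10, 29, 59)},
        {"position": (140, 90), "timestamp": datetime(2012, 1, 19, 10, 30, 5)},
    ],
}

def matches(point, location, max_distance=10.0, max_delta=timedelta(seconds=2)):
    """Assumed predetermined standard: a trajectory point corresponds to the
    object's location information when position and time are both close."""
    (px, py), (lx, ly) = point["position"], location["position"]
    close_in_space = ((px - lx) ** 2 + (py - ly) ** 2) ** 0.5 <= max_distance
    close_in_time = abs(point["timestamp"] - location["timestamp"]) <= max_delta
    return close_in_space and close_in_time

def integrate_flow_and_object(flow, obj):
    """Relate the flow information with the object identified by the
    corresponding location information; the related pair is the new result."""
    if any(matches(p, obj["location"]) for p in flow["trajectory"]):
        return {"object": obj, "flow": flow}
    return None  # no correspondence under the assumed standard

print(integrate_flow_and_object(flow_result, object_result))

Matching on both position and timestamp is one plausible instance of the "predetermined standard" that the claims deliberately leave open; a deployed system might instead use overlap of bounding regions, identifier continuity, or a learned similarity measure.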
SG2013062302A 2011-02-17 2012-01-19 Information processing device SG192807A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2011031613 2011-02-17
PCT/JP2012/000291 WO2012111252A1 (en) 2011-02-17 2012-01-19 Information processing device

Publications (1)

Publication Number Publication Date
SG192807A1 true SG192807A1 (en) 2013-09-30

Family

ID=46672201

Family Applications (1)

Application Number Title Priority Date Filing Date
SG2013062302A SG192807A1 (en) 2011-02-17 2012-01-19 Information processing device

Country Status (7)

Country Link
US (1) US9213891B2 (en)
JP (1) JP6142533B2 (en)
CN (1) CN103380442B (en)
BR (1) BR112013020345B1 (en)
RU (1) RU2562441C2 (en)
SG (1) SG192807A1 (en)
WO (1) WO2012111252A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105182763A (en) * 2015-08-11 2015-12-23 中山大学 Intelligent remote controller based on voice recognition and realization method thereof
JP7363163B2 (en) * 2019-07-26 2023-10-18 日本電気株式会社 Monitoring device, monitoring method, program, and monitoring system
WO2021056050A1 (en) * 2019-09-24 2021-04-01 NETGENIQ Pty Ltd Systems and methods for identifying and/or tracking objects
JP7379059B2 (en) * 2019-10-02 2023-11-14 キヤノン株式会社 Intermediate server device, information processing device, communication method
CN112163583A (en) * 2020-09-25 2021-01-01 珠海智通信息技术有限公司 Method for recognizing digital meter reading, recognition device and computer readable storage medium
US11776275B2 (en) * 2021-10-11 2023-10-03 Worlds Enterprises, Inc. Systems and methods for 3D spatial tracking

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2874032B2 (en) * 1990-11-27 1999-03-24 富士通株式会社 Software work tools
US5963653A (en) * 1997-06-19 1999-10-05 Raytheon Company Hierarchical information fusion object recognition system and method
JP4239635B2 (en) * 2003-03-20 2009-03-18 ソニー株式会社 Robot device, operation control method thereof, and program
JP4450306B2 (en) * 2003-07-11 2010-04-14 Kddi株式会社 Mobile tracking system
JP3931879B2 (en) 2003-11-28 2007-06-20 株式会社デンソー Sensor fusion system and vehicle control apparatus using the same
EP1686766B1 (en) 2005-01-28 2007-06-06 Research In Motion Limited Automated integration of content from multiple information stores using a mobile communication device
RU2315352C2 (en) 2005-11-02 2008-01-20 Самсунг Электроникс Ко., Лтд. Method and system for automatically finding three-dimensional images
JP5072655B2 (en) * 2008-03-03 2012-11-14 キヤノン株式会社 Image processing apparatus, image processing method, program, and storage medium
JP5282614B2 (en) 2009-03-13 2013-09-04 オムロン株式会社 Model data registration method and visual sensor for visual recognition processing
JP2011008791A (en) * 2010-06-30 2011-01-13 Advanced Telecommunication Research Institute International Object recognition method of robot
CN103459099B (en) * 2011-01-28 2015-08-26 英塔茨科技公司 Mutually exchange with a moveable tele-robotic

Also Published As

Publication number Publication date
US9213891B2 (en) 2015-12-15
BR112013020345A2 (en) 2017-08-08
JPWO2012111252A1 (en) 2014-07-03
US20140037150A1 (en) 2014-02-06
RU2013138366A (en) 2015-02-27
CN103380442A (en) 2013-10-30
RU2562441C2 (en) 2015-09-10
CN103380442B (en) 2016-09-28
JP6142533B2 (en) 2017-06-07
BR112013020345B1 (en) 2021-07-20
WO2012111252A1 (en) 2012-08-23

Similar Documents

Publication Publication Date Title
US9213891B2 (en) Information processing device
CN111950424B (en) Video data processing method and device, computer and readable storage medium
US11315366B2 (en) Conference recording method and data processing device employing the same
CN112088402A (en) Joint neural network for speaker recognition
US10878819B1 (en) System and method for enabling real-time captioning for the hearing impaired via augmented reality
CN106294774A (en) User individual data processing method based on dialogue service and device
US20140147023A1 (en) Face Recognition Method, Apparatus, and Computer-Readable Recording Medium for Executing the Method
KR20070118038A (en) Information processing apparatus, information processing method, and computer program
US20120002878A1 (en) Image processing apparatus, method, and program that classifies data of images
KR20210088680A (en) Video cutting method, apparatus, computer equipment and storage medium
KR101617649B1 (en) Recommendation system and method for video interesting section
KR20160011916A (en) Method and apparatus of identifying user using face recognition
CN110941992B (en) Smile expression detection method and device, computer equipment and storage medium
KR102213494B1 (en) Apparatus and method for identifying action
CN115828112A (en) Fault event response method and device, electronic equipment and storage medium
CN109558788A (en) Silent voice inputs discrimination method, computing device and computer-readable medium
TWI759099B (en) A method and device for detecting tampered images
CN111737670A (en) Multi-mode data collaborative man-machine interaction method and system and vehicle-mounted multimedia device
CN112017633A (en) Voice recognition method, voice recognition device, storage medium and electronic equipment
CN111881740A (en) Face recognition method, face recognition device, electronic equipment and medium
KR20080046490A (en) Method for identifying face using montage and apparatus thereof
CN116883900A (en) Video authenticity identification method and system based on multidimensional biological characteristics
CN110415689B (en) Speech recognition device and method
CN111027434A (en) Training method and device for pedestrian recognition model and electronic equipment
CN115810209A (en) Speaker recognition method and device based on multi-mode feature fusion network