JP5343652B2 - Operation screen control apparatus, image forming apparatus, and computer program - Google Patents

Info

Publication number
JP5343652B2
Authority
JP
Japan
Prior art keywords
operation screen
screen
operation
voice
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2009071719A
Other languages
Japanese (ja)
Other versions
JP2010224890A (en)
Inventor
光晴 永井
潤 國岡
広明 久保
歩 伊藤
Original Assignee
コニカミノルタ株式会社 (Konica Minolta, Inc.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by コニカミノルタ株式会社 (Konica Minolta, Inc.)
Priority to JP2009071719A
Publication of JP2010224890A
Application granted
Publication of JP5343652B2
Anticipated expiration

Description

  The present invention relates to an operation screen control device, an image forming apparatus, and a computer program for operating an operation screen displayed on a display surface of an operation panel by voice.

  In recent years, as the performance of image forming apparatuses has improved, the contents of the operation screens (setting screens) displayed on the operation panel of an image forming apparatus have diversified, and the number of operation screens has increased. In addition, the tree structure of the operation screens tends to be deeply hierarchical, making it difficult for the user to find a favorite or frequently used operation screen.

  Therefore, for example, when the user wants to transition from the currently displayed operation screen to another operation screen in a different series, many screens must be displayed along the way, and it is difficult to reach the target operation screen. Displaying the target operation screen thus takes time and effort for the user, and the operation time increases accordingly.

  Further, it has been proposed to provide the image forming apparatus with a voice recognition function. In an image forming apparatus having a voice recognition function, the user can operate the operation screen by voice. For example, while a certain operation screen is displayed, the same operation as pressing a key displayed on that screen can be performed by issuing the corresponding voice command instead of pressing the key.

  In addition, a speech recognition device has been proposed in which cluster information is registered indicating that a word's speech is a recognition target in all modes, regardless of the specific mode of the device (Patent Document 1).

Japanese Patent Laid-Open No. 5-35291

  In an image forming apparatus having a voice recognition function, operations on the operation screen can be performed by voice, so it is convenient if a voice command for transitioning to the target operation screen is registered for each operation screen.

  However, in that case, the user must memorize a large number of voice commands, and a user who forgets a voice command becomes unable to perform the operation.

  It is also conceivable to display, on the operation screen, keys for making a direct transition to target operation screens registered as favorite screens. In this case, however, since the screen size is limited, when there are many favorite screens it may not be possible to display all of them on one operation screen.

  Moreover, in the speech recognition apparatus described in Patent Document 1, registering such cluster information allows registered word speech to be recognized in all modes; however, it does not allow the user to transition from an arbitrary screen to the desired operation screen quickly and without trouble.

  In view of such problems, an object of the present invention is, when operating the operation screen displayed on the display surface of the operation panel by voice, to enable the user to transition to the target operation screen in a short time and with little effort, regardless of which operation screen is currently displayed.

  An operation screen control device according to an embodiment of the present invention includes: a voice input unit through which a user inputs voice; a voice recognition unit that recognizes the input voice and generates a voice command; operation screen storage means for storing a plurality of operation screens prepared for display on the display surface of the operation panel; a dictionary table, provided corresponding to each of the operation screens, that registers voice commands in association with link information to the operation screen to which each voice command transitions; screen switching means that, when a voice command is generated from the user's voice, refers to the dictionary table corresponding to the operation screen displayed at that time and transitions to the operation screen corresponding to the generated voice command; and batch registration means that, when there is a command to additionally register a voice command for transitioning to a specific operation screen, additionally registers that voice command and the link information to the specific operation screen in all of the dictionary tables.

  Further, an operation screen selection table is provided that stores key information about the keys displayed on the operation screens in association with operation screen information of the transition-destination operation screen to which a transition is made when each key is operated. In the dictionary tables, the key information about the keys displayed on the operation screens is registered as the link information, and the screen switching means transitions to the operation screen corresponding to the input voice command by using the key information obtained by referring to the dictionary table and then referring to the operation screen selection table.
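The two-stage lookup described above (dictionary table, then operation screen selection table) can be sketched as follows. This is a minimal illustration only; the table contents and names such as `screen_for` are assumptions, not the patent's implementation:

```python
# Stage 1: per-screen dictionary table maps a voice command SW to
# key information KW. Stage 2: the operation screen selection table
# maps key information KW to the transition-destination screen number.
dictionary_gm30 = {"magnification": "Y24"}   # dictionary table for screen GM30
selection_table = {"Y24": "10G09000"}        # KW -> destination screen number RW

def screen_for(phrase, dictionary):
    """Resolve a recognized phrase to the screen number to display."""
    key_info = dictionary.get(phrase)        # stage 1: dictionary table
    return selection_table.get(key_info)     # stage 2: selection table

assert screen_for("magnification", dictionary_gm30) == "10G09000"
```

Because the dictionary table yields key information rather than a screen number directly, the same selection table can serve every dictionary table.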

  According to the present invention, when operating the operation screen displayed on the display surface of the operation panel by voice, the user can transition to the target operation screen quickly and with little effort, regardless of which operation screen is currently displayed.

FIG. 1 is a diagram illustrating an example of a network system including an image forming apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of a hardware configuration of the image forming apparatus according to the embodiment.
FIG. 3 is a diagram illustrating an example of the operation panel.
FIG. 4 is a diagram illustrating an example of a configuration of the image forming apparatus.
FIG. 5 is a diagram illustrating an example of a configuration of the modules relating to the voice recognition function.
FIG. 6 is a diagram illustrating an example of a functional configuration of an operation screen control unit.
FIG. 7 is a diagram illustrating an example in which a dictionary table is provided corresponding to each screen.
FIG. 8 is a diagram illustrating an example of the dictionary table corresponding to a basic setting screen.
FIG. 9 is a diagram illustrating an example of the dictionary table corresponding to a magnification screen.
FIG. 10 is a diagram illustrating an example of the dictionary table corresponding to a screen/page aggregation screen.
FIG. 11 is a diagram illustrating an example of an operation screen selection table.
FIG. 12 is a diagram illustrating an example of transition states of operation screens including a reduction screen.
FIG. 13 is a diagram illustrating an example of a dictionary table in which command information has been additionally registered.
FIG. 14 is a flowchart for explaining an example of the flow of processing of the entire image forming apparatus when a favorite screen is registered.
FIG. 15 is a flowchart for explaining an example of the flow of processing of the entire image forming apparatus when a voice command is input.
FIG. 16 is a diagram illustrating an example of the hierarchical structure of operation screens.
FIG. 17 is a diagram illustrating an example of the highest-level operation screen showing the state of setting operations on each operation screen.
FIG. 18 is a flowchart for explaining an example of the flow of processing of the entire image forming apparatus when an operation screen in the middle of a setting operation is registered in the same manner as a favorite screen.
FIG. 19 is a diagram illustrating an example of a functional configuration of the operation screen control unit in the image forming apparatus of the second embodiment.
FIG. 20 is a diagram illustrating an example of a dedicated dictionary table.
FIG. 21 is a flowchart for explaining an example of the flow of processing of the entire image forming apparatus in the second embodiment.

[First Embodiment]
FIG. 1 shows a network system SYS including an image forming apparatus 1 according to an embodiment of the present invention.

  As shown in FIG. 1, the network system SYS includes an image forming apparatus 1, a file server 2a, a terminal apparatus 3a, a communication line NT1, and the like.

  The image forming apparatus 1, the file server 2a, and the terminal device 3a are connected to each other via a communication line NT1. As the communication line NT1, a LAN, WAN, intranet, dedicated line, or the like is used.

  The file server 2a is a file server for managing electronic documents (electronic data).

  The terminal device 3a is a terminal device used by a user of the network system SYS. Each user can use the “PC print function” of the image forming apparatus 1, described later, from the terminal device 3a.

  The fax terminal 4a transmits / receives data by facsimile to / from the image forming apparatus 1 or another fax terminal via the communication line NT2. A public line or the like is used as the communication line NT2.

  The image forming apparatus 1 integrates various application functions such as a copy function, a PC print function, a fax transmission function, an e-mail transmission function, an SMB transmission function, an FTP transmission function, and a box function. Such an apparatus is sometimes called a multi-function peripheral or MFP (Multi-Function Peripheral).

  The image forming apparatus 1 of the present embodiment is provided with a voice execution function in addition to basic functions such as the copy function described above. The voice execution function performs voice recognition processing (VN) and executes predetermined processing based on the result of the voice recognition processing. The voice recognition processing recognizes voice input from the microphone MK provided on the operation panel 10f of the image forming apparatus 1 shown in FIG. 1, based on a recognition dictionary (grammar) stored in a storage medium.

  As shown in FIG. 2, the image forming apparatus 1 includes a CPU 10a, a RAM 10b, a ROM 10c, a hard disk 10d, a control circuit 10e, an operation panel 10f, a scanner unit 10g, a printer unit 10h, a fax unit 10j, a network interface 10k, and the like.

  The scanner unit 10g is a device that optically reads an image such as a photograph, characters, a picture, or a chart drawn on a document sheet (hereinafter simply referred to as a “document”) and converts it into electronic data. In addition to paper, various other sheets and objects can be used as the document.

  The printer unit 10h prints image data read by the scanner unit 10g or image data transmitted from another device on a sheet using toners of four colors of yellow, magenta, cyan, and black.

  The fax unit 10j transmits the image data read by the scanner unit 10g to the fax terminal via the communication line or receives the image data transmitted from the fax terminal.

  The network interface 10k is a NIC (Network Interface Card), connects to other devices via a LAN or the Internet, and transmits and receives electronic data.

  The control circuit 10e controls devices such as the hard disk 10d, the operation panel 10f, the scanner unit 10g, the printer unit 10h, the fax unit 10j, and the network interface 10k.

  The operation panel 10f is used when a user gives an instruction to the image forming apparatus 1 or notifies the user of a message from the image forming apparatus 1.

  The application functions described above are realized by the cooperation of all or some of these devices.

  The “copy function” is a function of reading an image of a document by the scanner unit 10g and printing the image data obtained thereby on a sheet by the printer unit 10h.

  The “PC print function” is a function that receives image data and the like, via the network interface 10k, from a terminal device connected to the image forming apparatus 1 through a communication line, and prints the image data on a sheet by the printer unit 10h. This function is sometimes referred to as “network printing”.

  The “fax transmission function” is a function for transmitting image data read by the scanner unit 10g from the fax unit 10j to a fax terminal or the like.

  The “fax receiving function” is a function for receiving image data from a fax terminal or the like by the fax unit 10j and printing the image data on a sheet by the printer unit 10h.

  The “e-mail transmission function” is a function that attaches image data read by the scanner unit 10g to an e-mail and transmits it from the network interface 10k.

  The “SMB transmission function” is a function for directly transmitting image data read by the scanner unit 10g to a transmission destination designated by an IP address or the like based on SMB (Server Message Block).

  The “FTP transmission function” is a function for transmitting image data read by the scanner unit 10g based on FTP (File Transfer Protocol).

  The “box function” provides on the hard disk 10d a storage area called a “box” or “personal box” for each user, corresponding to a folder or directory on a personal computer, and allows each user to save image data obtained by operating the scanner unit 10g into his or her own storage area. It is sometimes called a “document server function”.

  By specifying a box in advance in the terminal device when using the PC print function, image data to be printed can be stored in the box while printing. The box can be designated using, for example, a driver function of the image forming apparatus 1 installed in the terminal device.

  As shown in FIG. 3, the operation panel 10f is provided with a touch panel TP, a microphone MK, an operation unit TK, a voice recognition button KB, a copy button MB, a scan FAX button SB, a BOX button XB, and the like.

  The touch panel TP is a display device that displays a message notified to the user from the image forming apparatus 1 or a screen (operation screen GM) for using various functions. It also plays a role as an input device by the function of the touch panel.

  The microphone MK is a device with which the user inputs voice (voice commands); it converts the input voice into analog voice data ANDT.

  The operation unit TK includes a button group for inputting the number of prints or a fax number.

  The voice recognition button KB is a button for switching to a voice recognition mode (a mode for performing voice recognition). When this button is pressed, the image forming apparatus 1 executes voice recognition processing, and executes predetermined processing based on the result of voice recognition.

  The copy button MB is a button for switching to a print mode (a mode for printing image data or the like on paper). When this button is pressed, a screen for performing settings for copying (copy setting screen) is displayed on the touch panel TP.

  The scan FAX button SB is a button for switching to a transmission mode (a mode for transmitting image data or the like to another apparatus). When this button is pressed, a screen for setting a transmission destination such as image data is displayed on the touch panel TP.

  The BOX button XB is a button for switching to a box mode (a mode for storing image data in a box or referring to image data stored in a box). When this button is pressed, a screen for designating a box as a storage destination of image data or the like or a box to be referred to is displayed on the touch panel TP.

  FIG. 4 shows the configuration of the image forming apparatus 1, organized by function.

  In FIG. 4, the input image processing unit 22 performs image processing such as color correction processing and resolution conversion processing on the image data read by the scanner unit 10g as necessary, and then stores the data in the memory 10b. The image data is read from the memory 10b as necessary, the output image processing unit 23 performs color conversion processing, screen control processing, smoothing processing, density correction processing, and the like as necessary, and the resulting data is sent to the printer unit 10h, where it is printed on a sheet.

  The user's voice input from the microphone MK provided on the operation panel 10f is converted into voice data by the voice input unit 24, input to the CPU 10a via the system IF 21, and processed.

  A plurality of operation screens GM for operating the image forming apparatus 1 are stored in advance on the hard disk 10d; in accordance with user operations or in response to the user's voice input, the necessary operation screen GM is read out and displayed on the touch panel TP of the operation panel 10f.

  The CPU 10a controls the entire image forming apparatus 1 in addition to performing voice recognition processing and operation screen control based on the input voice.

  For example, in operation screen control, when a voice command is generated from the user's voice, the CPU 10a performs screen switching processing that refers to the dictionary table corresponding to the operation screen displayed at that time and transitions to the operation screen corresponding to the generated voice command; and, when there is a command to additionally register a voice command for transitioning to a specific operation screen, the CPU 10a performs batch registration processing that additionally registers the voice command and the link information to the specific operation screen in all of the dictionary tables. Details will be described later.

  Computer programs for realizing these functions or processes are recorded on a portable recording medium MS, such as a CD-ROM or DVD-ROM (MS1) or a semiconductor memory (MS2) (see FIG. 2), from which they can be installed. They can also be downloaded from a server via a network.

  Computer programs and data for realizing these functions are installed on the hard disk 10d, loaded into the RAM 10b as necessary, and executed by the CPU 10a; in this way, the CPU 10a operates as a computer. The image forming apparatus 1 may also be connected to an external computer via a LAN or the like. Further, part or all of the functions of the image forming apparatus 1 may be realized by a digital processor or a hardware circuit.

  In FIG. 5, the image forming apparatus 1 is provided with a voice recognition application 31 for executing voice recognition processing. That is, the voice input from the voice input device (microphone) MK is converted into voice data and then input to the voice recognition application 31 by the voice input driver 34 via the OS (Operating System) 33.

  The voice recognition application 31 manages the voice recognition library 32, recognizes the voice uttered by the user, and generates a voice command (character string). It also converts the voice command into the corresponding internal operation.

  The voiceprint discrimination unit 35 analyzes the waveform of the voice command and performs voiceprint discrimination. By the voiceprint discrimination, a voice ID that is voiceprint information is generated. The voice ID is used to specify the user who issued the voice command.

  The voice command generated for registration is additionally registered in the dictionary table TB of the voice recognition library 32. When a voice command is generated for control, operation screen control and other controls are performed based on the generated voice command. The system IF 36 stores and reads various setting information.

  Hereinafter, the processing and operation of the operation screen control unit SS will be described with reference to FIGS. 6 to 18. Here, an example will be described in which the screens frequently used by the user (hereinafter sometimes referred to as “favorite screens”) are the reduced screen GM33 and the page aggregation screen GM34.

  FIG. 6 is a diagram illustrating an example of a functional configuration of the operation screen control unit SS in the image forming apparatus 1. FIG. 7 is a diagram illustrating an example in which a dictionary table is provided for each screen. FIG. 8 is a diagram illustrating an example of the dictionary table TB30 corresponding to the basic setting screen GM30. FIG. 9 is a diagram illustrating an example of the dictionary table TB31 corresponding to the magnification screen GM31. FIG. 10 is a diagram illustrating an example of the dictionary table TB32 corresponding to the screen/page aggregation screen GM32. FIG. 11 is a diagram showing an example of the operation screen selection table TB50. FIG. 12 is a diagram showing an example of the transition states of the operation screens including the reduced screen GM33 and the page aggregation screen GM34. FIG. 13 is a diagram illustrating an example of a dictionary table TB in which command information DT has been additionally registered. FIG. 14 is a flowchart for explaining an example of the flow of processing of the entire image forming apparatus 1 when a favorite screen is registered. FIG. 15 is a flowchart for explaining an example of the flow of processing of the entire image forming apparatus 1 when a voice command is input. FIG. 16 is a diagram showing an example of a state in which the hierarchy of operation screens is deep and complicated. FIG. 17 is a diagram showing an example of the highest-level operation screen showing the state of setting operations on each operation screen. FIG. 18 is a flowchart for explaining an example of the flow of processing of the entire image forming apparatus 1 when an operation screen in the middle of a setting operation is registered in the same manner as a favorite screen.

  In FIG. 6, the operation screen control unit SS includes an operation screen storage unit 40, a voice recognition library 41, a screen display control unit 42, a voice print discrimination unit 43, a command recognition unit 44, a batch registration unit 45, a general control unit 46, and an image. A processing unit 47 and the like are included.

  Note that the voice recognition processing (VD) described below is mainly executed by the voice recognition library 41, the voiceprint discrimination unit 43, the command recognition unit 44, and the voice input unit 24. The voice input unit 24 is also one of the functional configurations of the operation screen control unit SS.

  In FIG. 6, the operation screen storage unit 40 stores an operation screen GM for the user to give an instruction to the image forming apparatus 1 or to notify the user of a message from the image forming apparatus 1. Each operation screen GM is assigned a screen number for distinguishing it from other operation screens GM.

  The speech recognition library 41 stores and manages a dictionary table TB (TB30, 31, 32,...). The dictionary table TB is provided in association with each operation screen GM, as shown in FIG.

  Here, the dictionary table TB will be described in detail. FIGS. 8 to 10 show examples of the dictionary tables TB30, TB31, and TB32 corresponding to the basic setting screen GM30, the magnification screen GM31, and the screen/page aggregation screen GM32 shown in FIG. 7. The dictionary tables TB30, TB31, and TB32 have the same configuration, but their contents differ in part.

  That is, in the dictionary table TB30, as shown in FIG. 8, command information DT30, which is information related to voice commands, is stored and managed. In the dictionary tables TB31 and TB32, command information DT31 and DT32 are stored and managed, respectively.

  In the command information DT30, the following are recorded in association with one another: link information (key information) KW relating to the transition-destination operation screen when the operation screen GM is transitioned in accordance with an input voice command; a screen number GRN; a remark GW indicating the name of an operation key (operation key name) or an operation item displayed on the screen; and the voice command SW input by voice. In the screen number GRN, the screen number of the operation screen associated with the dictionary table TB is recorded.

  That is, as the voice command SW, a phrase (command phrase) is recorded that is either the same word or phrase as the operation key name indicated in the remarks GW, or a variation of it that takes into account fluctuations in the words the user may utter for that operation key. A plurality of voice commands SW can be recorded for one piece of link information KW.

  In addition, key information Y21 to Y27 of operation keys displayed on the basic setting screen GM30 shown in FIG. 7 is recorded as the link information KW.

  Therefore, for example, when the input voice command matches any one of the voice commands SW recorded in the dictionary table TB30, processing is performed to transition to the operation screen whose screen number is associated with the key information Y indicated in the link information KW. That is, the process is executed as if the operation key whose name is indicated in the remarks GW associated with that voice command SW had been pressed.
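As a sketch, matching a recognized phrase against the voice commands SW of the currently displayed screen's dictionary table might look like this. The entries and phrases below are illustrative assumptions (FIG. 8 defines the actual TB30), chosen to show that one piece of link information KW can carry several command phrases to absorb wording fluctuations:

```python
# Dictionary table TB30 for the basic setting screen GM30:
# each link information KW maps to one or more voice commands SW
# (remarks GW omitted; all phrases here are hypothetical).
TB30 = {
    "Y24": ["magnification", "zoom"],
    "Y25": ["screen/page aggregation", "page aggregation"],
}

def lookup_link_info(table, phrase):
    """Return the key information KW whose voice commands SW contain
    the recognized phrase, or None if no command matches."""
    for key_info, commands in table.items():
        if phrase in commands:
            return key_info
    return None

assert lookup_link_info(TB30, "zoom") == "Y24"
```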

  Returning to FIG. 6, the screen display control unit 42 performs processing for displaying a predetermined operation screen on the operation panel 10f when a voice is input or when a button provided on the operation panel 10f is pressed. The screen display control unit 42 also manages the operation screen selection table TB50 and, by referring to it based on the control signal from the general control unit 46, performs processing for transitioning from the currently displayed operation screen to the operation screen corresponding to the input voice command.

  The operation screen selection table TB50 stores and manages transition destination screen information DT50. As shown in FIG. 11, the transition destination screen information DT50 indicates the link information KW and the transition destination screen number RW in association with each other. Key information Y of operation keys is recorded as link information KW. The transition destination screen number RW is the screen number of the transition destination operation screen GM when the key indicated by the key information Y is pressed.

  Therefore, when a voice is input from the user, the command recognition unit 44 recognizes the voice, generates the voice command SW, and then refers to the dictionary table TB30 to acquire the key information Y corresponding to the generated voice command SW. The acquired key information Y is sent to the general control unit 46, which sends it to the screen display control unit 42 as the control signal SN1.

  The screen display control unit 42 searches the operation screen selection table TB50 based on the key information Y, and acquires the transition destination screen number RW corresponding to the key information Y. Then, the operation screen GM of the acquired transition destination screen number RW is displayed on the touch panel TP.

  For example, in the transition destination screen information DT50 of FIG. 11, the screen number 10G09000 is associated with the key information Y24 that is one of the link information KW. When receiving the key information Y24 as the control signal SN1, the screen display control unit 42 extracts the magnification screen GM31, which is the operation screen GM having the screen number 10G09000, from the operation screen storage unit 40 and displays it.

  As another example, when the screen display control unit 42 receives the key information Y25 as the control signal SN1, the screen / page aggregation screen GM32 with the screen number 10G12000 is extracted from the operation screen storage unit 40 and displayed.
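The resolution from key information to transition-destination screen number can be sketched with the two associations just mentioned (Y24 to screen number 10G09000, Y25 to 10G12000). This is a simplified illustration, not the actual table format:

```python
# Transition destination screen information DT50 (per FIG. 11):
# key information KW -> transition destination screen number RW.
TB50 = {
    "Y24": "10G09000",  # magnification screen GM31
    "Y25": "10G12000",  # screen/page aggregation screen GM32
}

def transition(key_info):
    """Resolve the key information received as control signal SN1
    to the screen number of the operation screen to display."""
    return TB50.get(key_info)

assert transition("Y24") == "10G09000"
```

The screen display control unit would then extract the operation screen GM with the returned screen number from the operation screen storage unit 40 and display it.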

  Returning to FIG. 6, the voiceprint discrimination unit 43 discriminates the user's voiceprint and generates voiceprint information. That is, the voiceprint discrimination unit 43 performs processing for determining the voiceprint of the user who input the voice command. For example, when the user utters a voice command, the time from the start to the end of the utterance (speech time), the speech frequency spectrum, the speech volume, and so on are measured and analyzed to determine which user input the voice command. A discrimination result signal SN2 indicating which user input the voice command is then sent to the command recognition unit 44.
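A toy illustration of this kind of feature-based discrimination follows. Real voiceprint analysis is far more involved; the feature set (duration, dominant frequency, volume), the profile values, and the nearest-profile matching are assumptions made purely for illustration:

```python
# Hypothetical per-user feature profiles: (speech time in seconds,
# dominant frequency in Hz, volume in dB).
profiles = {
    "userA": (0.8, 220.0, 60.0),
    "userB": (1.2, 130.0, 70.0),
}

def identify(features):
    """Return the voice ID of the user whose profile is closest
    to the measured utterance features (squared distance)."""
    def dist(profile):
        return sum((a - b) ** 2 for a, b in zip(features, profile))
    return min(profiles, key=lambda user: dist(profiles[user]))

assert identify((1.1, 140.0, 68.0)) == "userB"
```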

  When a voice is input from the user, the command recognition unit 44 recognizes the voice and generates a voice command SW. Then, the key information Y corresponding to the generated voice command SW is acquired by referring to the dictionary table TB30. When referring to the dictionary table TB30, the discrimination result signal SN2 is also used, so that key information Y is obtained that matches both the voice command SW and the voiceprint information. The acquired key information Y is sent to the general control unit 46 as a recognition result signal SN3, and the general control unit 46 sends it to the screen display control unit 42 as the control signal SN1.

  In the speech recognition, the command recognition unit 44 first converts the speech data DGDT acquired from the speech input unit 24 into a character string (hereinafter sometimes referred to as a “recognition target phrase”), and then extracts, from the voice commands SW recorded in the dictionary table TB30, those that match the recognition target phrase. When a voice command SW matching the recognition target phrase is extracted, the link information KW associated with that voice command SW is read out.
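One simplified reading of the joint matching on voice command SW and voiceprint information is sketched below. The per-entry voice IDs, phrases, and key names are assumptions for illustration; the idea is only that a command registered by one user need not fire for another:

```python
# Entries of a dictionary table, each carrying the voice ID of the
# registering user; None marks a common command valid for any user.
entries = [
    # (voice command SW, voice ID, link information KW)
    ("reduced screen", "userA", "Y31"),
    ("magnification", None, "Y24"),
]

def recognize(phrase, voice_id):
    """Extract the link information KW whose voice command SW matches
    the recognition target phrase and whose voice ID matches the
    discrimination result (SN2)."""
    for sw, vid, kw in entries:
        if sw == phrase and vid in (None, voice_id):
            return kw
    return None

assert recognize("reduced screen", "userA") == "Y31"
assert recognize("reduced screen", "userB") is None
```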

  When there is a command to additionally register, in the dictionary tables TB, a voice command for transitioning to a specific operation screen GM, the batch registration unit 45 performs processing (command registration processing) for additionally registering the voice command and the link information KW to the specific operation screen GM in all of the dictionary tables TB stored in the speech recognition library 41.
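The command registration processing can be sketched as follows; the table contents and names are illustrative assumptions. The essential point is that the same (voice command, link information) pair is written into every dictionary table, so the favorite-screen command works from whichever operation screen is displayed:

```python
# One dictionary table per operation screen (contents hypothetical):
# voice command SW -> link information KW.
tables = {
    "TB30": {"magnification": "Y24"},
    "TB31": {"reduce": "Y31"},
    "TB32": {"page aggregation": "Y32"},
}

def register_favorite(voice_command, link_info):
    """Batch registration: add the favorite-screen command to
    every dictionary table."""
    for tb in tables.values():
        tb[voice_command] = link_info

register_favorite("reduced screen", "Y31")
assert all("reduced screen" in tb for tb in tables.values())
```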

  Next, command registration processing will be described together with user operation procedures.

  The command registration process is started by a favorite screen registration operation by the user.

  First, the user determines a favorite screen. In the first embodiment, the reduced screen GM33 and the page aggregation screen GM34 are the user's favorite screens. An example of transitions to the reduced screen GM33 and the page aggregation screen GM34 is shown in FIG. 12.

  In FIG. 12, for example, when the magnification key (button) BTN10 is pressed on the basic setting screen GM30 or when a voice command corresponding to the magnification key BTN10 is input by voice, the screen changes to the magnification screen GM31. When the reduction key BTN12 is pressed on the magnification screen GM31 or when a voice command corresponding to the reduction key BTN12 is input by voice, the screen changes to the reduction screen GM33.

  In addition, when the screen / page aggregation key BTN11 is pressed on the basic setting screen GM30 or when a voice command corresponding to the screen / page aggregation key BTN11 is input by voice, the screen transitions to the screen / page aggregation screen GM32. When the page aggregation key BTN13 is pressed on the screen / page aggregation screen GM32 or when a voice command corresponding to the page aggregation key BTN13 is input by voice, the screen transitions to the page aggregation screen GM34.

  There are two operation methods for the favorite screen registration operation. The first is a method of designating an operation key for transitioning to the favorite screen (an operation key displayed on the operation screen immediately before the favorite screen) and registering, for the designated operation key, a voice command composed of an arbitrary word or phrase.

  When adopting this method, the user first presses the voice recognition button KB provided on the operation panel 10f for a long time. As a result, the image forming apparatus 1 determines that there has been a command to additionally register a voice command for changing to a specific operation screen in the dictionary table TB, and shifts to the voice registration mode.

  The user performs an operation for displaying the operation screen GM on which the operation key for transitioning to the favorite screen is displayed. For example, when the favorite screen is the “reduced screen GM33”, the magnification screen GM31, which is the immediately preceding screen, is displayed. Then, on the magnification screen GM31, the operation key for transitioning to the reduced screen GM33, that is, the reduction key BTN12, is pressed and thereby designated. When the favorite screen is the “page aggregation screen GM34”, the screen / page aggregation screen GM32 is displayed, and the operation key for transitioning to the page aggregation screen GM34, that is, the page aggregation key BTN13, is pressed and thereby designated.

  When the operation key is designated, the operation screen control unit SS displays a screen showing a message (utterance message) prompting the user to utter the voice command to be registered. Alternatively, a pre-registered voice guide is played from a speaker (not shown) provided in the image forming apparatus 1 to prompt the user to speak.

  When the screen showing the utterance message is displayed or the voice guide is played, the user lightly presses the voice recognition button KB and utters the voice command to input it.

  When voice is input, the voiceprint discrimination unit 43 generates voiceprint information. The voiceprint information is sent to the batch registration unit 45 as, for example, voice ID information VD.

  In parallel with the processing by the voiceprint discrimination unit 43, the command recognition unit 44 generates a voice command SW corresponding to the input voice. The generated voice command SW is sent to the batch registration unit 45.

  When the voice command SW and the voice ID information VD are acquired, the batch registration unit 45 executes a process for registering them in all dictionary tables TB stored in the voice recognition library 41. At this time, in addition to the voice command SW and the voice ID information VD, the link information KW corresponding to the operation key designated by the user is also registered.

  As for the link information KW, the dictionary table TB associated with the operation screen on which the designated operation key is displayed is searched for the link information KW associated with that operation key. For example, when the user designates the reduction key BTN12, the link information KW “Y33” associated with that key is searched for and extracted from the dictionary table TB31 (see FIG. 9) associated with the magnification screen GM31 on which the reduction key BTN12 is displayed (see FIG. 7).
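
  This search by operation item name can be sketched as follows; the function name, field names, and table contents are hypothetical illustrations of the lookup, not the actual implementation.

```python
def find_link_info(dictionary_table, operation_item_name):
    """Search a dictionary table for the link information KW associated
    with the designated operation key (by its operation item name)."""
    for entry in dictionary_table:
        if entry["item_name"] == operation_item_name:
            return entry["link_info"]
    return None

# Hypothetical excerpt of dictionary table TB31 for the magnification
# screen GM31 (see FIG. 9); contents are illustrative only.
tb31_entries = [
    {"item_name": "enlarge", "link_info": "Y32"},
    {"item_name": "reduce", "link_info": "Y33"},  # reduction key BTN12
]

print(find_link_info(tb31_entries, "reduce"))  # Y33
```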

  The voice command SW, the voice ID information VD, the extracted link information KW, and the operation item name of the operation key designated by the user are additionally registered, as new command information DT for the favorite screen, in all dictionary tables TB stored in the voice recognition library 41 (see FIG. 13).

  The second operation method of the favorite screen registration operation is a method of displaying and operating a screen to be registered as a favorite screen.

  When this method is adopted, the user displays an operation screen to be registered as a favorite screen, and presses the screen name portion of the operation screen. For example, in the case of the reduced screen GM33, the screen name portion MS of the reduced screen GM33 shown in FIG. 12 is pressed.

  Then, the operation screen control unit SS searches the operation screen storage unit 40 for the operation screen immediately before the operation screen (favorite screen) whose screen name portion MS was pressed. It then searches the operation keys displayed on the retrieved operation screen for the operation key for transitioning to the favorite screen, and executes the same processing as in the first operation method described above, as if the user had designated that operation key on the operation screen immediately before the favorite screen.
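
  Finding the key that leads to a given favorite screen can be sketched as a reverse lookup through the operation screen selection table; the names `find_transition_key` and `tb50`, and the table contents, are assumptions for illustration.

```python
def find_transition_key(selection_table, favorite_screen):
    """Find, via the operation screen selection table, the key
    information whose transition destination is the favourite screen."""
    for key_info, destination in selection_table.items():
        if destination == favorite_screen:
            return key_info
    return None

# Hypothetical operation screen selection table TB50:
# key information -> transition-destination operation screen.
tb50 = {"Y31": "GM31", "Y32": "GM32", "Y33": "GM33", "Y34": "GM34"}

print(find_transition_key(tb50, "GM33"))  # Y33
```

  Once the key information is found this way, the rest of the registration proceeds exactly as when the user designates the key directly.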

  The user can select which operation method to use.

  Returning to FIG. 6, the general control unit 46 controls the image processing unit 47 and also controls the entire operation screen control unit SS. Further, upon receiving the recognition result signal SN3 from the command recognition unit 44, the general control unit 46 determines whether the recognition result signal SN3 is a signal for transitioning the operation screen or a signal for setting a function provided in the image forming apparatus 1, and sends a control signal to each unit according to the result of the determination. When the recognition result signal SN3 is a signal for transitioning the operation screen, the general control unit 46 sends to the screen display control unit 42 a control signal SN1 indicating that the currently displayed operation screen should be transitioned and indicating the link information KW of the transition-destination operation screen.

  The image processing unit 47 performs various image processing on the image data read by the scanner unit 10g or the image data transmitted from another device in accordance with a control signal from the general control unit 46.

  Next, the processing of the entire image forming apparatus 1 when executing the command registration processing will be described with reference to the flowchart of FIG. 14.

  The user long presses the voice recognition button KB provided on the operation panel 10f (# 71). As a result, the image forming apparatus 1 starts processing for registering a favorite screen (command registration processing) (# 72).

  Next, the user selects whether to execute the process for registering the favorite screen by designating an operation key or by displaying the favorite screen itself. When the process is executed by designating an operation key (when 1 is selected in # 73), the user displays the operation screen immediately before the favorite screen, and presses and thereby designates the operation key for transitioning to the favorite screen (# 74).

  When the process is executed by displaying the screen to be registered as a favorite screen (when 2 is selected in # 73), the user displays the operation screen to be registered as a favorite screen (# 82) and presses the screen name portion MS of that operation screen (# 83).

  When the screen name portion MS of the favorite screen is pressed, the operation screen control unit SS searches the operation screen storage unit 40 for the operation screen immediately before the favorite screen. Then, an operation key for transitioning to the favorite screen is extracted from the operation keys displayed on the searched operation screen, and the process is executed assuming that the extracted operation key is designated.

  When the operation key is designated, the image forming apparatus 1 displays a voice message or outputs a voice, and prompts the user to speak (# 75).

  The user lightly presses the voice recognition button KB (# 76), and utters and inputs a voice command to be registered (# 77). When a voice command is input, a voice ID is generated and saved (# 78). Also, a voice command SW is generated by voice recognition (# 79, # 80). The voice command SW, voice ID, link information KW, and operation item name are additionally registered in all dictionary tables TB as new command information DT for the favorite screen (# 81).

  Next, the processing of the entire image forming apparatus 1 in the command recognition processing will be described with reference to the flowchart of FIG. 16.

  The user presses (lightly) the voice recognition button KB provided on the operation panel 10f (# 91). As a result, the image forming apparatus 1 determines that there is an instruction to start the command recognition process (# 92). The user inputs voice (# 93).

  The image forming apparatus 1 executes voice recognition processing for the input voice and generates a voice command SW (# 94). Then, the generated voice command SW is collated with the voice command SW stored in the dictionary table TB (# 96). Also, voiceprint information is generated for the input voice (# 95).

  If there are a plurality of matching voice commands SW in the dictionary table TB (Yes in # 97), the voiceprint information is referred to, the voice command SW that matches the voiceprint information is specified, and one piece of link information KW is extracted (# 98). Then, the operation screen GM corresponding to the input voice is extracted and displayed with reference to the operation screen selection table TB50 (# 99). In step # 99, if the input voice is not recorded as a voice command SW in the dictionary table TB, processing corresponding to the input voice is performed.
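
  Steps # 96 to # 99 can be sketched in outline as follows. The sketch is hypothetical: the function and variable names are invented, and the tables are toy data modeling two users who registered the same command phrase.

```python
def resolve_operation_screen(dictionary_table, selection_table,
                             voice_command, voiceprint):
    """Collate the recognized command with the dictionary table,
    disambiguate by voiceprint when several users registered the same
    command, then map the extracted link information to the
    transition-destination screen via the selection table."""
    matches = [e for e in dictionary_table
               if e["voice_command"] == voice_command]
    if len(matches) > 1:  # same command registered by several users (# 97)
        matches = [e for e in matches if e["voiceprint"] == voiceprint]
    if not matches:
        return None  # input voice is not a registered voice command
    link_info = matches[0]["link_info"]  # one KW extracted (# 98)
    return selection_table.get(link_info)  # screen via TB50 (# 99)

# Hypothetical data: two users registered the same command "shrink".
table = [
    {"voice_command": "shrink", "voiceprint": "userA", "link_info": "Y33"},
    {"voice_command": "shrink", "voiceprint": "userB", "link_info": "Y34"},
]
tb50 = {"Y33": "GM33", "Y34": "GM34"}

print(resolve_operation_screen(table, tb50, "shrink", "userB"))  # GM34
```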

  As described above, in the first embodiment, the command information DT for the favorite screen is additionally registered in all dictionary tables TB stored in the speech recognition library 41, so that the user can display the favorite screen from any operation screen by uttering the voice command for transitioning to the favorite screen.

  Therefore, the user can perform screen operations more simply and quickly. For example, the favorite screen can be easily displayed even when the operation screen has a deep and complicated hierarchy as shown in FIG.

In addition, since the voiceprint information is included in the additionally registered command information DT for the favorite screen, even if the same voice command is registered by a plurality of users, the user who registered each voice command can be distinguished.
[Modification of Favorite Screen Registration Operation]
In the first embodiment, the case where the user registers an operation screen before performing a setting operation as a favorite screen has been described as an example. However, the user can also register an operation screen in the middle of a setting operation in the same manner as a favorite screen.

  For example, the reduced screen GM33 shown in FIG. 12 shows a state in which the copy magnification is set to 93.0%. The page aggregation screen GM34 shown in the same figure shows a state in which the print output form is set to two-in-one (division). When the setting operation is temporarily interrupted in this state, the user performs the operation for registering the reduced screen GM33 or the page aggregation screen GM34 in the middle of the setting operation in the same manner as a favorite screen, while the setting state is held. This makes it possible to call up the operation screen in the middle of the setting operation from any operation screen and resume the subsequent setting operation. When the user resumes the setting operation, the state of the setting operation on each operation screen can also be displayed on the operation screen of the highest hierarchy (the basic setting screen GM30 in this embodiment), as shown in FIG. 17. The state may also be reported by an audio guide.

  Next, the processing of the entire image forming apparatus 1 when an operation screen in the middle of a setting operation (hereinafter referred to as a “setting intermediate screen”) is registered in the same manner as a favorite screen will be described with reference to a flowchart.

  FIG. 18 is a flowchart for explaining an example of the processing flow of the entire image forming apparatus 1 when the operation screen in the middle of the setting operation is registered in the same manner as the favorite screen.

  In FIG. 18, the user performs a predetermined setting operation on each operation screen (# 111). When temporarily interrupting the setting operation, the user long-presses the voice recognition button KB provided on the operation panel 10f (# 112). Here, when the user registers the displayed operation screen as a favorite screen without holding the setting state, as described in the first embodiment (when 1 is selected in # 113), the favorite screen registration operation is started by designating an operation key or by pressing the screen name portion MS of the favorite screen (# 114).

When the setting state of the currently displayed setting screen is to be held (when 2 is selected in # 113), the user performs an operation for saving the setting state of the setting intermediate screen in the memory 10b (hard disk 10d) or the like. The image forming apparatus 1 then executes the same processing as the command registration processing described in the first embodiment, assuming that the favorite screen registration operation has been started for the setting intermediate screen. The command registration processing in this case is the same as the processing performed when the screen name portion MS of a favorite screen is pressed. That is, the image forming apparatus 1 searches the operation screen storage unit 40 for the operation screen immediately before the setting intermediate screen. It then searches the operation keys displayed on the retrieved operation screen for the operation key for transitioning to the setting intermediate screen, and, assuming that the retrieved operation key has been pressed and designated, executes the same processing as in the command registration processing of the first embodiment (# 75 to # 80 in FIG. 14) (# 116 to # 120). Then, the command phrase information ST and the extracted link information KW are additionally registered, as new command information DT for the favorite screen, in all dictionary tables TB stored in the speech recognition library 41 (# 121). In addition to the command phrase information ST and the extracted link information KW, a voice ID or the operation item name of the operation key may also be registered.
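
  Registering a setting intermediate screen while holding its setting state can be sketched as follows. The sketch is hypothetical: the function name, the dictionaries standing in for the memory 10b and the speech recognition library, and the example setting values are all assumptions.

```python
def register_setting_screen(speech_library, held_settings, screen_id,
                            settings, voice_command, link_info):
    """Hold the current setting state in storage, then register the
    voice command for the setting intermediate screen in every
    dictionary table, as in favourite-screen command registration."""
    held_settings[screen_id] = dict(settings)  # save state (memory/HDD)
    for table in speech_library.values():
        table.append({"voice_command": voice_command,
                      "link_info": link_info})

# Hypothetical storage and library.
library = {"TB30": [], "TB31": []}
held = {}
register_setting_screen(library, held, "GM33",
                        {"copy_magnification": 93.0},
                        "resume reduce", "Y33")

print(held["GM33"])  # {'copy_magnification': 93.0}
```

  Calling the registered command later would restore the held settings for screen GM33 and let the user resume the interrupted operation.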
[Second Embodiment]
Next, a second embodiment will be described with reference to FIGS. 19 to 21. Note that in the second embodiment, only the parts that differ from the network system SYS and the image forming apparatus 1 of the first embodiment will be described. In FIGS. 19 to 21, components that are the same as in the first embodiment are given the same reference numerals, and detailed description thereof is omitted as appropriate.

  FIG. 19 is a diagram illustrating an example of a functional configuration of the operation screen control unit SSB in the image forming apparatus 1 according to the second embodiment, FIG. 20 is a diagram illustrating an example of the dedicated dictionary table TB60, and FIG. 21 is a flowchart for explaining an example of a processing flow of the entire image forming apparatus 1 according to the second embodiment.

  In the first embodiment, the new command information DT for the favorite screen is additionally registered in all dictionary tables TB stored in the speech recognition library 41 (the dictionary tables TB corresponding to all operation screens GM stored in the operation screen storage unit 40). Instead, the new command information DT for the favorite screen may be additionally registered in a dedicated dictionary table TB60 (see FIG. 19) that is provided separately from the dictionary tables TB associated with the operation screens GM.

  In FIG. 19, the speech recognition library 41B stores and manages a dictionary table TB (TB30, 31, 32,...) And a dedicated dictionary table TB60.

  When the registration unit 45B acquires the voice ID information VD, the voice command SW, and the link information KW, it registers them as new command information DT (command information DT60) for the favorite screen in the dedicated dictionary table TB60 stored in the voice recognition library 41B (see FIG. 20).

  The command recognition unit 44B basically executes the same processing as the command recognition processing of the first embodiment. The difference is that, when executing the command recognition processing, the command recognition unit 44B first refers to the command information DT60 stored in the dedicated dictionary table TB60 shown in FIG. 20 and searches for a voice command SW matching the input voice command. If no matching voice command SW is found, it then refers to the command information DT stored in the dictionary table TB associated with the currently displayed operation screen.

  Only the new command information DT (command information DT60) for the favorite screen is registered in the dedicated dictionary table TB60. Therefore, when a voice command for transitioning to the favorite screen is input, a matching voice command SW can be found in a shorter time than by referring to the dictionary table TB of each operation screen, and the command recognition processing can be completed earlier.
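
  The dedicated-table-first lookup can be sketched as a simple two-stage search. This is an illustrative sketch; the function name and the toy table contents are hypothetical.

```python
def recognize_command(dedicated_table, current_screen_table, voice_command):
    """Refer to the dedicated dictionary table TB60 first; fall back to
    the dictionary table of the currently displayed operation screen."""
    for table in (dedicated_table, current_screen_table):
        for entry in table:
            if entry["voice_command"] == voice_command:
                return entry["link_info"]
    return None

# Hypothetical tables: TB60 holds only favourite-screen commands,
# TB30 is the table of the currently displayed screen.
tb60 = [{"voice_command": "my favourite", "link_info": "Y33"}]
tb30 = [{"voice_command": "magnification", "link_info": "Y31"}]

print(recognize_command(tb60, tb30, "my favourite"))  # Y33
print(recognize_command(tb60, tb30, "magnification"))  # Y31
```

  Because TB60 contains only the favourite-screen entries, the first stage stays small no matter how many operation screens (and per-screen tables) the apparatus has.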

  FIG. 21 is a flowchart for explaining an example of the processing flow of the entire image forming apparatus 1 in the second embodiment.

  Next, the processing of the entire image forming apparatus 1 in the second embodiment will be described with reference to the flowchart of FIG. 21.

  The processing of steps # 131 to 134 is the same as steps # 91 to 94 of FIG. 16 of the first embodiment. The image forming apparatus 1 performs voice recognition processing on the input voice command, and collates it with the voice command SW of the command information DT60 stored in the dedicated dictionary table TB60 (# 136).

  If a voice command SW matching the recognition target phrase cannot be found in the dedicated dictionary table TB60, the dictionary table TB corresponding to the currently displayed operation screen GM is then referred to.

  Steps # 135 and 137 to 139 are the same as steps # 95 and 97 to 99 in FIG.

  In the second embodiment, among the voice commands SW registered in the dedicated dictionary table TB60 or the dictionary tables TB, the voice command SW that matches the input voice command is adopted as the result of the voice recognition processing (# 138).

  In the above-described embodiment, the operation screen control unit SS corresponds to the “operation screen control device” in the present invention, and the voice input unit 24 corresponds to the “voice input unit” in the present invention. The voice recognition library 41, the voiceprint discrimination unit 43, and the command recognition unit 44 correspond to the “voice recognition unit” of the present invention. The screen display control unit 42 corresponds to “screen switching means” in the present invention, and the collective registration unit 45 corresponds to “collective registration means” in the present invention.

  The voice ID information VD corresponds to “voice print information” in the present invention, and the voice print determination unit 43 corresponds to “voice print determination unit” in the present invention. The memory 10b (hard disk 10d) corresponds to a storage device in the present invention.

  In addition, the configurations and functions of the operation screen control unit SS, the image forming apparatus 1, and the network system SYS, the contents indicated by each piece of data, the table contents, and the processing contents and order may be changed as appropriate in accordance with the spirit of the present invention.

1 Image forming apparatus
24 Voice input unit (voice input means, voice recognition unit)
41, 41B Voice recognition library (voice recognition unit)
42 Screen display control unit (screen switching means)
43 Voiceprint discrimination unit (voice recognition unit, voiceprint discrimination unit)
44, 44B Command recognition unit (voice recognition unit)
45 Batch registration unit (batch registration means)
45B Registration unit
DT30, 31, 32, 60 Command information
KW Link information (key information)
Y21 to Y27 Key information
SS, SSB Operation screen control unit (operation screen control device)
SW Voice command
TB30, 31, 32 Dictionary table
TB50 Operation screen selection table
TB60 Dedicated dictionary table
VD Voice ID information (voiceprint information)

Claims (7)

  1. An operation screen control device for operating, by voice, an operation screen displayed on a display surface of an operation panel, comprising:
    voice input means for allowing a user to input voice;
    a voice recognition unit that recognizes the input voice and generates a voice command;
    operation screen storage means for storing a plurality of operation screens prepared for display on the display surface of the operation panel;
    a dictionary table that is provided corresponding to each of the operation screens and that registers, in association with each other, voice commands and link information to the operation screens to be transitioned to by each voice command;
    screen switching means for, when a voice command is generated from the user's voice, transitioning to the operation screen corresponding to the generated voice command by referring to the dictionary table corresponding to the operation screen displayed at that time; and
    batch registration means for, when there is a command to additionally register in the dictionary table a voice command for transitioning to a specific operation screen, additionally registering the voice command to be additionally registered and link information to the specific operation screen in the dictionary table;
    wherein an operation screen selection table is provided that stores key information about keys displayed on the operation screens in association with operation screen information of the transition-destination operation screen that is transitioned to when each key is operated;
    in the dictionary table, key information about keys displayed on any of the operation screens is registered as the link information; and
    the screen switching means transitions to the operation screen corresponding to the input voice command by referring to the operation screen selection table using the key information obtained by referring to the dictionary table.
  2. The operation screen control device according to claim 1, wherein, in the dictionary table, a plurality of voice commands can be registered in correspondence with one piece of link information.
  3. The operation screen control device according to claim 2, wherein the voice recognition unit includes a voiceprint discrimination unit that discriminates a user's voiceprint and generates voiceprint information;
    in the dictionary table, the voiceprint information is registered together with the voice command; and
    the screen switching means extracts the link information that matches both the voice command and the voiceprint information registered in the dictionary table, and transitions the operation screen based on the extracted link information.
  4. The operation screen control device according to any one of claims 1 to 3, wherein the batch registration means performs the additional registration in all the dictionary tables provided corresponding to the operation screens.
  5. The operation screen control device according to any one of claims 1 to 4, wherein, when there is an instruction to interrupt the processing in the course of the additional registration, the batch registration means stores and holds the state at that time in a storage device.
  6. An image forming apparatus provided with the operation screen control device according to any one of claims 1 to 5.
  7. A computer program for a computer provided in an image forming apparatus capable of operating, by voice, an operation screen displayed on a display surface of an operation panel,
    the computer program, when executed by the computer, causing the image forming apparatus to realize:
    voice input means for allowing a user to input voice;
    a voice recognition unit that recognizes the input voice and generates a voice command;
    operation screen storage means for storing a plurality of operation screens prepared for display on the display surface of the operation panel;
    a dictionary table that is provided corresponding to each of the operation screens and that registers, in association with each other, voice commands and key information about keys displayed on the operation screens, the key information being link information to the operation screens to be transitioned to by each voice command;
    screen switching means for, when a voice command is generated from the user's voice, transitioning to the operation screen corresponding to the generated voice command by referring to the dictionary table corresponding to the operation screen displayed at that time;
    batch registration means for, when there is a command to additionally register in the dictionary table a voice command for transitioning to a specific operation screen, additionally registering the voice command to be additionally registered and the key information serving as link information to the specific operation screen in all target dictionary tables; and
    an operation screen selection table that stores key information about keys displayed on the operation screens in association with operation screen information of the transition-destination operation screen that is transitioned to when each key is operated;
    wherein the screen switching means transitions to the operation screen corresponding to the input voice command by referring to the operation screen selection table using the key information obtained by referring to the dictionary table.
JP2009071719A 2009-03-24 2009-03-24 Operation screen control apparatus, image forming apparatus, and computer program Active JP5343652B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2009071719A JP5343652B2 (en) 2009-03-24 2009-03-24 Operation screen control apparatus, image forming apparatus, and computer program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2009071719A JP5343652B2 (en) 2009-03-24 2009-03-24 Operation screen control apparatus, image forming apparatus, and computer program

Publications (2)

Publication Number Publication Date
JP2010224890A JP2010224890A (en) 2010-10-07
JP5343652B2 true JP5343652B2 (en) 2013-11-13

Family

ID=43042009

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2009071719A Active JP5343652B2 (en) 2009-03-24 2009-03-24 Operation screen control apparatus, image forming apparatus, and computer program

Country Status (1)

Country Link
JP (1) JP5343652B2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160150124A1 (en) * 2014-11-24 2016-05-26 Kyocera Document Solutions Inc. Image Forming Apparatus with User Identification Capabilities

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000029585A (en) * 1998-07-08 2000-01-28 Canon Inc Voice command recognizing image processor
FR2788615B1 (en) * 1999-01-18 2001-02-16 Thomson Multimedia Sa Apparatus comprising a voice or manual user interface and process for aiding the learning of voice commands such an apparatus
JP2003169171A (en) * 2001-11-30 2003-06-13 Canon Inc Image forming device
JP2006005695A (en) * 2004-06-18 2006-01-05 Nec Corp Portable terminal
JP4548093B2 (en) * 2004-11-01 2010-09-22 日本電気株式会社 Display data editing method for a portable terminal device, and a mobile terminal device
JP2008096541A (en) * 2006-10-06 2008-04-24 Canon Inc Speech processing device and control method therefor
JP2008210056A (en) * 2007-02-23 2008-09-11 Matsushita Electric Works Ltd Load control unit
JP2010049432A (en) * 2008-08-20 2010-03-04 Konica Minolta Business Technologies Inc Display screen control device and method thereof, and information processor

Also Published As

Publication number Publication date
JP2010224890A (en) 2010-10-07

Similar Documents

Publication Publication Date Title
US8229947B2 (en) Image processing apparatus and method for controlling image processing apparatus
CN101005551B (en) The method of starting an image processing apparatus and an image processing device
EP2122539B1 (en) Translation and display of text in picture
US6580838B2 (en) Virtual zero task time speech and voice recognition multifunctioning device
JP4371965B2 (en) Image processing apparatus, image processing method
CN100539623C (en) Service environment setting system, electronic apparatus, wireless communication terminal, programme
JP4405831B2 (en) An image processing apparatus and a control method thereof, a program
US6975993B1 (en) System, a server for a system and a machine for use in a system
CN102314326B (en) Data processing equipment
US6704119B1 (en) File system and storage medium storing program used in such system
US20060181520A1 (en) Information input device, information input method, and information input program
CN1495661A (en) Information search started by scanned image medium
JP2006350551A (en) Document conversion device, document conversion method, document conversion system, document processor and information processor
US20070176946A1 (en) Image forming apparatus, control method thereof, and program for implementing the method
EP1463289A2 (en) Data processing device
JP2008210383A (en) Automatic job template generator and automatic job template generation method
US20070133054A1 (en) Storage medium for managing job log, job log management method, image processing apparatus, and image processing system
JP4066691B2 (en) Print control apparatus and program
JP4878471B2 (en) Information processing apparatus and control method thereof
EP1491993A2 (en) Recording apparatus and recording control method for executing recording according to setting of print parameters
KR101179370B1 (en) Information processing apparatus, method of controlling the same, and storage medium
JP2006003568A (en) Image forming apparatus, image forming method, program for making computer execute the method, image processing system and image processing apparatus
JP2008123298A (en) Information processing method and system
US7265863B2 (en) Image forming system, image forming device, function setting method and storage medium
CN101370067B (en) Image forming apparatus, display processing apparatus, display processing method

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20110905

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20120809

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20120925

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20121121

A711 Notification of change in applicant

Free format text: JAPANESE INTERMEDIATE CODE: A712

Effective date: 20130417

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20130716

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20130729

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150