A Day in the Life

A day in my life. Thoughts on leadership, management, startups, technology, software, concurrent development, etc... Basically the stuff I think about from 10am to 6pm.

3/10/2006

Digipede: Distributing Excel Computations - Excel Automation Considerations

Excel assumes a GUI so automation (which is GUI-less) can be a little tricky. I'm going to take you through a few areas of consideration in the automation process.

Microsoft has a document available called "Considerations for server-side Automation of Office" which explains in detail how Microsoft does not recommend running automated Excel objects on remote machines. Please take the time to read the Microsoft document because it's important for you to understand the technological limitations of Excel. Within the Microsoft document is a section called Problems using Automation of Office server-side, I will address each of the issues and how you can work around them on the Digipede Network. I've also encountered other automation problems and I've broken all the issues up into three main groups: Administration Issues, Worker Workbook Issues, and Debugging Tips and Tricks.

Administration Issues

User Identity
Excel is designed to run in an interactive manner. So Excel makes assumptions as to what directories and security configurations exist when it is launched. Running Excel as an automated COM server does not guarantee that the needed settings are there. You will have to configure each compute node to satisfy Excel's needs. We have identified two areas:
  1. Excel requires a valid user identity because it needs the associated profile. When a domain user logs into a machine the operating system automatically creates directories and settings that Excel uses. To solve this problem on the Digipede Network, install each Digipede Agent with a valid user identity. During the Digipede Agent installation on the 'Logon Information' page select 'Specific User' and enter a valid user login for the machine. Make sure that the user identity has logged into the compute node at some time so that the default profile information exists.

  2. To launch an automated Excel instance from the Digipede Agent you will need to make sure that the specified user identity has the correct COM security settings. Follow these steps:

    1. Open the 'Component Services' from within the 'Administrative Tools' or from the command-line by typing dcomcnfg.exe.

    2. Expand the 'Component Services' branch so that you can see 'My Computer'.

    3. Right-click on 'My Computer' and select 'Properties'.

    4. On the 'My Computer Properties' dialog, select the 'COM Security' page.

    5. In the 'Launch and Activation Permissions' group box, select the 'Edit Default' button.

    6. If your user id is not listed in the 'Group or user names' list control, select 'Add' and add it.

    7. With your user id selected in 'Group or user names', select the checkbox 'Local Launch' and 'Local Activation'.

    8. Select 'OK', select 'OK', and then close 'Component Services'.


Resiliency and Stability
During installation Microsoft Office products have the option of performing "install on first use" for some components. If an uninstalled component is needed by an Office instance an installation wizard for the missing component is started. This causes a problem for any automated server-side Office instances because now there is a UI that can't be dealt with programmatically. Developers will need to work with their IT departments to control this problem.

If a Digipede job seems to hang, I recommend running the Worker workbook on the compute node to make sure that this problem doesn't exist. If it does exist get the required components installed on the compute node. You will see similar behavior if you're Worker workbook requires an add-in. Make sure that everything the Worker workbook needs is installed on the compute node.

If your Excel distribution pattern creates an Excel object, it is recommended that all the compute nodes in your target pool run the same version of Excel.

Server-Side Security
There are several levels to security. The Digipede Network will not accept a work request that has not properly logged in. This ensures that only users who have been given access to the Digipede Network can request work on it.

Code level security is the responsibility of the developers. Each developer must ensure that their code is secure and doesn't create problems on the compute node or network.

You will need to configure Excel so that the Macro security level is low, to avoid the security dialog.

Worker Workbook Issues

NOTHING that is written to run on the compute nodes can have a GUI. Not Excel, not the run script, not your code. Dialogs will lock the compute node up while the software waits for a user to hit a button.

Interactivity with the Desktop
Excel was designed around the user interface so you have to programmatically disable all UI functionality. The Microsoft article "How To Dismiss a Dialog Box Displayed by an Office Application with Visual Basic" discusses various messages that may come up and how to disable them. (If you are distributing a Worker DLL, it is also important for the developer to remember to not put message boxes into the code contained in DoWork().)

Sample Visual Basic 6 code to disable Excel GUI:
    Dim myExcelApp As Excel.Application

myExcelApp.DisplayAlerts = False
myExcelApp.AskToUpdateLinks = False
myExcelApp.AlertBeforeOverwriting = False
myExcelApp.FeatureInstall = msoFeatureInstallNone

Disabling display of message boxes will avoid any user interaction problems. To be courteous to the user of the desktop where the Excel work is being done, it is recommended that you first save the current settings, disable everything, and then reset the values on exit.
Update: the above statement has been found to be wrong because each instance of Excel running as a COM server is running in it's own process space and as a result Application method calls do NOT affect other instances. The problems with Application method calls affecting other workbooks can happen when opening multiple workbooks in the same Excel instance.

Reentrancy and Scalability
Excel is an application that can be slow to close down and if you haven't released all the objects it doesn't close down at all. It's not enough to make a call to Application.Quit() you must also make sure to release all Excel objects. And if you are using managed code (ie. .NET) then you will also need to perform garbage collection...twice.

Sample C# ShutDown() method:
private void ShutDownExcel() {

if (mExcelApp != null) {

mExcelApp.DisplayAlerts = true;
mExcelApp.Quit();

System.Runtime.InteropServices.Marshal.ReleaseComObject(mExcelApp);
mExcelApp = null;
}

// Clean up memory so Excel can shut down.
GC.Collect();
GC.WaitForPendingFinalizers();

// The GC needs to be called twice in order to get the
// Finalizers called - the first time in, it simply makes
// a list of what is to be finalized, the second time in,
// it actually the finalizing. Only then will the
// object do its automatic ReleaseComObject.
GC.Collect();
GC.WaitForPendingFinalizers();
}

If you are not cleaning up the Excel objects by releasing them and performing garbage collection, multiple instances of Excel will be left open. This is visible by using Task Manager and sorting by Image Name. Excel does not support an unlimited number of active Excel instances so the compute node will eventually be unable to respond to Excel requests. As the developer you MUST confirm that your Worker workbook closes properly and that Excel is exiting cleanly. More information can be found in the Microsoft MSDN article "Office application does not quite after automation from Visual Studio .NET client".

One other factor to consider is that it is possible to have an active user running Excel. You must be careful not to affect the user's work.

  • Do not make any Application level Calculate() calls as this may cause calculations to start on the active user’s workbook.

  • Do not shut down the user's instance of Excel, so be very careful about calling Application.Quit().


  • Update: the above statement has been found to be wrong.

    Strange Excel Lessons
    Make sure to use fully qualified paths when referencing anything in Excel. Excel apparently doesn't use the current directory as the working directory so you need to set the Application.DefaultPath to control where files are saved. You will also need to open any files and run any macros by using a fully qualified path, otherwise Excel can't find them. It is safest to use fully qualified paths whenever referencing any Excel objects.

    Always control the Worker workbook through a script or an assembly. Trying to launch Excel directly makes it VERY hard to shutdown when it's finished. It's just easier to debug and to control using a launch mechanism. Make sure that the script engine is non-gui. (So use cscript instead of wscript.)

    Debugging Tips And Tricks

    Make sure that Excel is closing down properly on the compute node by looking at the Task Manager process list. When Excel is run automated it launches as not visible by default. This means you need to look on the Processes page to determine if you are orphaning an Excel process. You can quickly find the Excel process by sorting on the Image Name. You should kill all orphaned Excel processes.

    I used a few different techniques in the VBS script to determine what was going on:

    1. Create a log file and write to it to determine what is happening in the automated session.
      Dim fso, f1
      Dim logFileName
      logFileName = "mylog.txt"
      Set fso = CreateObject("Scripting.FileSystemObject")
      Set f1 = fso.OpenTextFile(logFileName , 8, True)
      f1.WriteLine("MyScript: Start")
      f1.Close

    2. If you create a results file you make want to save it before you exit the script, this lets you see exactly what is being generated by the Worker workbook before the Digipede Network moves it.
      ' Make a copy of the results file
      Set objFSO = CreateObject("Scripting.FileSystemObject")
      Dim strResultsFile
      strResultsFile = strPath & "\Results" & TaskId & ".xls"
      strSourceFile = strPath & "\Results.xls"
      objFSO.CopyFile strSourceFile , strResultsFile

    3. Write your code to catch errors, don't let them get passed out. You can then update the log file with the error code.
      Dim strMacroName
      strMacroName = "'" & strPath & "\Worker" & "!Sheet1.StartCmdBtn_Click"
      on error resume next
      myExcelWorker.Run strMacroName
      if err.number <> 0 Then
      ' Error occurred - just close it down.
      End If
      err.clear
      on error goto 0

    I also created a log file for the Worker workbook aid in debugging.

    Public Sub LogMessage(ByVal strMessage As String)

    '------------------------------------------------
    ' open existing file for appending,
    ' if the file does not exist a new one is created
    '-------------------------------------------------
    Dim strFilePath As String
    strFilePath = Application.DefaultFilePath & "\mylog.txt"

    Dim fileNumber As Integer
    fileNumber = FreeFile()

    Open strFilePath For Append As #fileNumber ' Create file name.
    Write #fileNumber, strMessage ' Output text.
    Close #fileNumber ' Close file.
    End Sub

    Conclusion

    Make sure that the run script and Worker workbook actually run on a compute node before trying an all out distribution test. Run it once. Debug it until it runs, then expand. Remember Henry David Thoreau's important contribution to the software industry, "Simplify, simplify, simplify."

    Essay Links:

    Start
    Automation
    Part 1
    Part 2
    Part 3
    Part 4
    Part 5

    Labels:

    2 Comments:

    At July 12, 2007 9:58 AM, Blogger Unknown said...

    I'm trying to disable a message box on opening an excel from a vbs program. I have set the Application.DisplayAlerts=False. But still I either get the message box or nothing is read from the excel. Please help

     
    At July 12, 2007 12:30 PM, Blogger Kim Greenlee said...

    Lakshmi,

    I can't really recommend a solution without the error message. Follow this link for more information about this type of problem. If you've got the DisplayAlerts flag set to False then it is likely the message indicates some configuration issue that you must resolve.

     

    Post a Comment

    << Home