pMatlab Troubleshooting Checklist
On this page, you'll find a series of questions and answers that will help you resolve the more common problems with running pMatlab jobs on the SuperCloud system.
Job Launch Problems
In this section, you'll find a series of questions and answers that will help you resolve some of the more common pMatlab job launch problems.
Is your path set up to include the gridMatlab directories?
Launch MATLAB® and enter the path command in the MATLAB® command window. In the first few lines of the output, does the path include these directories from your SuperCloud home directory?
tools/gridMatlab/src
tools/pMatlab/src
tools/pMatlab/MatlabMPI/src
matlab
If not, check your startup code (startup.m, startup_local.m, and any other startup code that is called from those files) for calls to clear all and remove those calls. Make sure your pMatlab script and code do not use clear all either.
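The two checks above can be sketched in MATLAB®. This is a minimal sketch, not an official SuperCloud tool: it assumes that the HOME environment variable points to your SuperCloud home directory, that the four directories sit directly under it, and that startup.m and startup_local.m can be located on the path with which. The variable names are hypothetical.

```matlab
% Part 1: are the expected directories on the MATLAB path?
% Assumption: the directories listed above sit directly under $HOME.
homeDir  = getenv('HOME');
expected = { ...
    fullfile(homeDir, 'tools', 'gridMatlab', 'src'), ...
    fullfile(homeDir, 'tools', 'pMatlab', 'src'), ...
    fullfile(homeDir, 'tools', 'pMatlab', 'MatlabMPI', 'src'), ...
    fullfile(homeDir, 'matlab')};

% path returns one long string; entries are separated by pathsep.
entries = strsplit(path, pathsep);
onPath  = ismember(expected, entries);   % exact string comparison

for k = 1:numel(expected)
    if onPath(k)
        fprintf('found:   %s\n', expected{k});
    else
        fprintf('MISSING: %s\n', expected{k});
    end
end

% Part 2: look for "clear all" in the startup files.
% Note: this simple pattern also flags commented-out lines, and it does not
% follow other startup code called from these files.
names = {'startup', 'startup_local'};
for k = 1:numel(names)
    f = which(names{k});                 % empty if the file is not on the path
    if isempty(f)
        continue;
    end
    fileLines = strsplit(fileread(f), newline);
    matches   = regexp(fileLines, '\<clear\s+all\>', 'once');
    hits      = find(~cellfun(@isempty, matches));
    for j = 1:numel(hits)
        fprintf('%s: "clear all" on line %d\n', f, hits(j));
    end
end
```

If a directory is reported missing even though it appears in the path output, the path entry may be spelled differently (for example, through a symbolic link), so treat the result as a hint rather than a verdict.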
Did you check whether there are resources available for running your job?
From the MATLAB® command window, enter the LLfree command and verify there are enough nodes and cores available for the CPU type that you requested. You can also enter LLfree at the Linux prompt if you are connected to the login node. Remember that if you are using triples mode to launch your job, you will be using whole nodes, so make sure there are enough nodes available.
Does the code that you're trying to run reside on the SuperCloud?
In order to run a job on the SuperCloud system, every processor must have access to all functions/methods in the code you are executing and any data the program accesses. Since every node can access your SuperCloud home directory or group shared directories, placing your code and data in your home directory or any group shared directory makes them accessible to the entire system.
If your code is not on SuperCloud, copy your code and any files that are needed in order to run your job to somewhere in your SuperCloud home directory. See Accessing and Transferring Data and Files for instructions on how to copy your files to SuperCloud.
If your code is on SuperCloud, confirm that your MATLAB® current working directory is somewhere in your SuperCloud home directory by running the pwd command.
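The pwd check can also be done programmatically. The following is a minimal sketch that assumes the HOME environment variable points to your SuperCloud home directory; it covers only the home directory, not group shared directories, and the variable names are hypothetical.

```matlab
% Assumption: $HOME is your SuperCloud home directory.
homeDir = getenv('HOME');
here    = pwd;

% True if the working directory is the home directory itself or lies below it.
% The isempty guard avoids a false positive when HOME is not set.
inHome = ~isempty(homeDir) && ...
    (strcmp(here, homeDir) || startsWith(here, [homeDir filesep]));

if inHome
    fprintf('Working directory is inside your home directory: %s\n', here);
else
    fprintf('Working directory is NOT inside %s: %s\n', homeDir, here);
end
```

A working directory in a group shared directory is also reachable from every node, so a negative result here is not necessarily a problem in that case.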
Did you try running the Param_Sweep example in your SuperCloud home directory?
The Param_Sweep example is located in $HOME/examples/Param_Sweep.
No: For instructions on how to run the Param_Sweep example, see the Verifying Your pMatlab Setup page.
Yes, it worked: The problem is probably somewhere in your code. Look in your job's log files for errors or other clues. See the page on Finding Your pMatlab Output.
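Before running the example, it can help to confirm that the Param_Sweep directory is present where expected. This is a minimal sketch that assumes the HOME environment variable points to your SuperCloud home directory; it only locates the example and lists its contents, and the variable names are hypothetical. For how to run the example, follow the Verifying Your pMatlab Setup page.

```matlab
% Location taken from the text above: $HOME/examples/Param_Sweep
exampleDir = fullfile(getenv('HOME'), 'examples', 'Param_Sweep');

% exist(..., 'dir') returns 7 when the name is an existing folder.
found = (exist(exampleDir, 'dir') == 7);

if found
    fprintf('Example directory found: %s\n', exampleDir);
    dir(exampleDir);    % list the files in the example directory
else
    fprintf('Example directory not found: %s\n', exampleDir);
end
```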
You've tried and confirmed everything here and you still can't launch a pMatlab job
If you've tried and confirmed all of the above items and still can't launch a pMatlab job, please let us know what you've tried, and also copy, paste and send any errors that you see to supercloud@mit.edu. Please attach any files that might be helpful in diagnosing the problem.
References
The following webpages were mentioned on this page, along with some others that might be helpful:
- Verifying Your pMatlab Setup
- Troubleshooting pMatlab Job Problems
- Steps to sanity check and debug
- Handling common job errors
- Finding Your Output
- Triples Mode
- Accessing and Transferring Data and Files
- LLx Online Courses: Practical HPC course, "Distributed Applications" module
- pMatlab: Getting Started
- Launching pMatlab Jobs on the SuperCloud Systems
- Troubleshooting SuperCloud pMatlab Job Errors