AspectShell User Guide
1 Overview
1.1 Supported Environments
AspectShell has been tested in the following environments:
- IA-32, Red Hat 7.3, 9.0, Ubuntu Hardy, CentOS-4
- IA-64, SuSE 9.0
- Alpha, Tru64
AspectShell has not been tested in other environments.
1.2 Architecture
AspectShell is implemented with a software agent plug-in architecture. A
submission agent is invoked to execute a command when the command is specified
with one of the placement keywords “on”, “in”, or “with”. A data transfer agent
is invoked when the redirection operator “>” or “<” is used with a
URL, or with a task identifier task_<id>,
task_all, or task_any. The software component
architecture of AspectShell is shown in figure 1.
AspectShell augments the implementation of the “command parser”, “command executor” and “stdin/stdout redirector” components in TCSH with the “submission agent” and “data agent” components. The command executor invokes the submission agent executable when the command parser detects one of the placement clauses “in”, “on” or “with”. The command is then passed as an argument to the agent executable defined in the AspectShell installation. Likewise, when the input or output of a shell command is redirected to a “gsiftp” URL or to a task identifier, the data agent is invoked. The data agent is responsible for opening a TCP socket descriptor to the remote location and passing this descriptor back to the shell environment, which then redirects the data through it.
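To make the dispatch rule concrete, the following Python sketch shows one way a parser could decide which agents a command line involves. It is only an illustration under stated assumptions: the names `classify_command` and `is_remote_target` are hypothetical, and matching on whitespace-separated tokens is a simplification. The real parser inside the TCSH source may work quite differently.

```python
import shlex

# Placement keywords and redirection operators named in this guide.
PLACEMENT_KEYWORDS = {"on", "in", "with"}
REDIRECT_OPERATORS = {">", "<"}


def is_remote_target(target):
    """Return True for a gsiftp URL or a task identifier
    (task_<id>, task_all, task_any)."""
    if target.startswith("gsiftp://"):
        return True
    if target.startswith("task_"):
        suffix = target[len("task_"):]
        return suffix in ("all", "any") or suffix.isdigit()
    return False


def classify_command(command_line):
    """Hypothetical helper: report which agents a command line would involve."""
    tokens = shlex.split(command_line)
    agents = set()
    for i, token in enumerate(tokens):
        # Skip the first token: it is the command name itself.
        if i > 0 and token in PLACEMENT_KEYWORDS:
            agents.add("submission agent")
        # A redirect only involves the data agent when its target is remote.
        if token in REDIRECT_OPERATORS and i + 1 < len(tokens):
            if is_remote_target(tokens[i + 1]):
                agents.add("data agent")
    return agents
```

Under these assumptions, `classify_command("ls -l /tmp on host1")` reports only the submission agent, while a plain `ls -l /tmp` involves neither agent and would be executed by the shell as usual. Note that an operator inside a quoted resource string, such as the “>” in "mem > 25", stays part of a single token and is not mistaken for a redirect.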
2 User Guide
2.1 Submitting a job
A command will be scheduled on a distributed resource when the user specifies
the placement construct “on <host name>” anywhere in the
command line. For example, a user may wish to list the contents of the
/tmp directory on a remote host as shown:
%> ls -l /tmp on host1
By default, jobs are run remotely in interactive mode, using a remote execution facility such as globus-job-run or ssh. On most high-performance clusters, however, jobs need to be submitted to a batch queuing system, and AspectShell supports this as well. To enable it, the user tells AspectShell to switch to “batch” mode:
%> setaspect -batch on
aspect mode: batch is on
Subsequent commands that use the submission constructs described later in this document will be submitted to the underlying batch queuing system. Batch mode can also be turned on by defining the _ASPECT_BATCH_MODE environment variable. To turn “batch” mode off, the user types:
%> setaspect -batch off
aspect mode: batch is off
2.2 Specifying resource requirements
A user may also give the underlying scheduler an explicit resource
requirement when selecting a host to execute a command. This resource
requirement is specified with the “with” keyword. The string
following this keyword is treated as the resource specification for the command,
and is propagated to an underlying resource broker for host selection. For
example, a NASTRAN simulation that needs to be executed on a SPARC host with
more than 25M of free memory can be run as shown:
%> nastran with "mem > 25 && type == SPARC"
In AspectShell, the command will then be dispatched on a host that meets the specified resource requirement.
2.3 Parametric sweep submissions
Job group submissions can be declared with the “in <number>
instances” placement construct. This causes the specified number of
instances, collectively called a group, to be submitted to the underlying
distributed resource management system. The example below declares that 20000
instances of the command cmkin.exe should be submitted to the environment
managed by the cluster resource manager:
%> cmkin.exe in 20000 instances
Production batch systems, however, often restrict the number of jobs that can be submitted to their queues, which prevents a single user from starving other users of compute resources. AspectShell therefore throttles the submission of the executable so that the local policy limit on queued jobs per user is not exceeded. The administrator specifies this limit with the _ASPECT_THROTTLE environment variable when first installing the submission agent on site.
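The throttling behaviour described above can be sketched in Python as follows. This is a minimal illustration, not the submission agent's actual code: the function names are hypothetical, and `submit`, `queued_count` and `wait` stand in for whatever hooks a site's batch system provides. Only the _ASPECT_THROTTLE variable name comes from this guide.

```python
import os


def read_throttle(environ=os.environ):
    """Return the _ASPECT_THROTTLE limit as an int, or None when unset
    (the guide gives the default as no throttle)."""
    value = environ.get("_ASPECT_THROTTLE")
    return int(value) if value else None


def submit_throttled(total_instances, limit, submit, queued_count, wait):
    """Submit total_instances jobs while keeping at most `limit` jobs queued.

    submit, queued_count and wait are caller-supplied callables (hypothetical
    hooks into the local batch system). limit=None means no throttle.
    """
    submitted = 0
    while submitted < total_instances:
        if limit is not None and queued_count() >= limit:
            wait()  # for example, sleep and then poll the queue again
            continue
        submit(submitted)
        submitted += 1
    return submitted
```

With a limit of 3 and a request for 20000 instances, this loop would keep topping the queue up to three jobs as earlier ones leave it, rather than submitting all 20000 at once.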
2.4 Transferring data
The shell also overloads the “<” and “>” I/O operators to allow remote files to be accessed by the shell. To enable this feature, the user turns on “io” mode by typing:
%> setaspect -io on
To turn this feature off, the user equivalently types:
%> setaspect -io off
Our implementation allows remote files to be specified with a GSIFTP URL for redirecting to the standard input and output of a command. For example, a user can specify that the standard output of a command be redirected to a file in a directory on a remote host:
%> echo "Hello World" > gsiftp://compute-9-2/tmp/test.txt
Similarly, a user can specify that the content of a file located on a remote host be redirected to the standard input of a command,
%> cat < gsiftp://compute-9-2/tmp/test.txt
2.5 Writing a parallel script
A command may also be executed in parallel by the shell when the user
specifies the placement construct “on <number> procs” as an
argument anywhere within the command line. The user may further use the “with”
keyword in conjunction with this construct to indicate a resource requirement
for selecting the hosts for parallel execution. For example, the following
command is executed in parallel on three compute nodes, each with at least 25M
of memory available:
%> /bin/hostname with "mem > 25" on 3 procs
Commands executed in parallel by the shell will also have the environment
variables _ASPECT_TASK_NUM and _ASPECT_TASKID set in their
environment. The former indicates the total number of tasks involved in
the parallel execution, and the latter the task's unique rank. This
is useful in Single Program Multiple Data (SPMD) type parallel execution, or for
a master-slave type parallel execution pattern, where a task's rank determines
the data it processes or the role it takes in the parallel execution.
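As an illustration of rank-determined data selection, the Python sketch below shows how a program launched in parallel might use the two variables to pick its share of the work. The helper names and the round-robin split are assumptions made for this example. The sketch also assumes ranks run from 0 to _ASPECT_TASK_NUM - 1, as the task_0 examples in this section suggest, and that an unset _ASPECT_TASKID can be treated as rank 0.

```python
import os


def task_identity(environ=os.environ):
    """Read this task's rank and the total task count.

    Falls back to a single task of rank 0 when the variables are unset
    (the rank fallback is an assumption of this sketch).
    """
    rank = int(environ.get("_ASPECT_TASKID", "0"))
    total = int(environ.get("_ASPECT_TASK_NUM", "1"))
    return rank, total


def my_share(items, rank, total):
    """Return the items assigned to this rank under a round-robin split."""
    return [item for index, item in enumerate(items) if index % total == rank]
```

Each task would call `task_identity()` once at start-up and then process only `my_share(inputs, rank, total)`, so that the tasks together cover every input exactly once.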
Commands executed in parallel by AspectShell may also be shell scripts
themselves. In this case, the scripts can communicate with each
other by using the overloaded I/O redirect operators “>” and “<”. A
script names the task it wishes to communicate with by specifying
the keyword task_<task number>.
For example, tasks with a rank greater than “0” may communicate the end of their computation to task rank “0” for synchronization:
if ( $_ASPECT_TASKID > 0 ) then
    echo "I am finished" > task_0
endif
Similarly, task “0” may wait on a blocking receive for all the other tasks in the parallel execution of the script before finally reporting its completion.
@ n = 1
while ( $n < $_ASPECT_TASK_NUM )
    set ack = `cat < task_$n`
    @ n = $n + 1
end
echo "Computation complete!"
We also introduce syntax to allow a task to broadcast data to all
participating tasks in a parallel execution of a command. A task may do this by
specifying the task_all keyword when redirecting output,
echo "$initial_data" > task_all
A task may also wait on input data from any other task in the parallel execution of a command by specifying the task_any keyword when redirecting input,
set response = `cat < task_any`
3 Administration
3.1 Installing AspectShell
[ewalker@lela ~]$ zcat aspect-tcsh-version.tar.gz | tar xvf -
[ewalker@lela ~]$ cd aspect-tcsh-version
[ewalker@lela ~/aspect-tcsh-version]$ ./configure
[ewalker@lela ~/aspect-tcsh-version]$ make; make install
The default installation will install all aspect-tcsh binaries in
$HOME/aspectshell/bin and create a symlink from
$HOME/.aspectshell to this installation directory.
However, you can change the path for the installation binaries by
specifying a different path with --prefix when you invoke configure.
Note that you will then need to manually create a symlink from
$HOME/.aspectshell to your new installation directory.
E.g. if you wish to install in /opt/aspectshell:
[ewalker@lela ~/aspect-tcsh-version]$ ./configure --prefix=/opt/aspectshell
[ewalker@lela ~/aspect-tcsh-version]$ make; make install
[ewalker@lela ~/aspect-tcsh-version]$ ln -s /opt/aspectshell/bin $HOME/.aspectshell
3.2 Configuring your environment
Optional configuration environment variables:
ASPECTSHELL_LOCATION - Identifies an alternative AspectShell installation location.
Default == $HOME/.aspectshell.
_ASPECT_SUBMIT_AGENT - Identifies the agent executable that
will execute a command invoked with an AspectShell placement construct.
The agent is responsible for mapping the command execution
to the appropriate underlying execution infrastructure. The environment
variable can specify the agent executable command line with the
meta-arguments %E, %U, and %R.
These meta-arguments will expand to the
executable command string, invoking username, and resource requirement string
respectively. If no resource is specified by the user, the %R value will be expanded to the string "default".
Default == $HOME/.aspectshell/aspect-submit-agent %E
_ASPECT_DATA_AGENT - Identifies the agent executable that
will perform the remote data transfer
when the overloaded shell “>” and “<” redirection
operators are invoked.
The location of the remote data source will be passed to the data agent which
is responsible for opening a channel to the location with the appropriate read or write mode,
and passing the channel file descriptor back to the shell internals for redirection.
Default == $HOME/.aspectshell/aspect-data-agent
_ASPECT_THROTTLE - Limits the number of jobs that are
submitted to the local batch queuing system. This is used for job ensemble submissions.
Default == no throttle.
_ASPECT_DEBUG - Turns on verbose debugging.
_ASPECT_INSTANCE_NUM -
Automatically defined by the
shell when the “in <num> instances” syntax extension in a job ensemble submission is invoked.
This informs the submit agent how many instances of the command need to be executed.
The agent is responsible for performing the ensemble submission.
Default == 1.
_ASPECT_TASK_NUM - Automatically defined by the shell
when the “on <num> procs” syntax extension in a parallel job execution is invoked.
This informs the submit agent how many parallel tasks need to be spawned.
Default == 1.
_ASPECT_RESOURCE - Automatically defined by the shell
when the “with
_ASPECT_BATCH_MODE -
Turns on the batch job submission semantics.
This causes the entire command line (including the aspect extension) to be passed to the submit agent for addition into a batch script file.
This mode is also automatically defined when the user specifies setaspect -batch on at the command line.
Default == none.
_ASPECT_NETWORK_DEVICE - Identifies the network interface on which the aspect-comm-agent will bind and listen. The aspect-comm-agent implements the shell message-passing feature.
Default == hostname.
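The %E, %U and %R meta-arguments of _ASPECT_SUBMIT_AGENT described above can be pictured with a small Python sketch. This is a hedged illustration only: `expand_agent_command` is a hypothetical name, the `agent` command used in the usage note is a placeholder, and the shell-quoting of expanded values is an assumption of this sketch rather than documented AspectShell behaviour. The one rule taken from this guide is that %R becomes the string "default" when the user gives no resource requirement.

```python
import getpass
import shlex


def expand_agent_command(template, executable, username=None, resource=None):
    """Expand %E, %U and %R in a submit-agent template (illustrative only)."""
    if username is None:
        username = getpass.getuser()
    values = {
        "%E": executable,
        "%U": username,
        # Per the guide, %R expands to "default" when no requirement is given.
        "%R": resource if resource else "default",
    }
    # Single pass over the template, so that text inserted for one
    # meta-argument is never re-scanned for another.
    result = []
    i = 0
    while i < len(template):
        key = template[i:i + 2]
        if key in values:
            # Quoting is an assumption of this sketch, not documented behaviour.
            result.append(shlex.quote(values[key]))
            i += 2
        else:
            result.append(template[i])
            i += 1
    return "".join(result)
```

For instance, expanding the placeholder template `agent -u %U -r %R %E` for the command `ls -l /tmp` run by user ewalker with no resource requirement would produce `agent -u ewalker -r default 'ls -l /tmp'`.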
3.3 Run-time Modes
Run-time modes can be turned on or off with the setaspect builtin command:
ewalker:~/mycluster-v2/aspect-tcsh> builtins | grep setaspect
repeat sched set setaspect setenv settc setty
ewalker:~/mycluster-v2/aspect-tcsh> setaspect -v
aspect mode: verbose off
aspect mode: debug is off
aspect mode: I/O is off
aspect mode: batch is off
ewalker:~/mycluster-v2/aspect-tcsh> setaspect -io on
Definition of modes:
- verbose - verbose output for the setaspect builtin command
- debug - debugging information
- I/O - redirection operator overloading
- batch - causes the entire command line (including the aspect extension) to be passed to the submit agent for processing


