Middleware Access Drivers 5.6

Information manager MAD

To abstract the monitoring and discovery middleware layer (the Grid Information System), GridWay uses a Middleware Access Driver (MAD) module to discover and monitor hosts. This module provides the basic operations on the monitoring and discovery middleware.

The format to send a request to the Information MAD, through its standard input, is:

OPERATION HID HOST ARGS

Where:

On the other side, the format to receive a response from the MAD, through its standard output, is:

OPERATION HID RESULT INFO

Where:

Table 2. Attributes that should be defined by the Information MADs.

Attribute                 Description
HOSTNAME                  FQDN (Fully Qualified Domain Name) of the execution host (e.g. “hydrus.dacya.ucm.es”)
ARCH                      Architecture of the execution host (e.g. “i686”, “alpha”)
OS_NAME                   Operating System name of the execution host (e.g. “Linux”, “SL”)
OS_VERSION                Operating System version of the execution host (e.g. “2.6.9-1.66”, “3”)
CPU_MODEL                 CPU model of the execution host (e.g. “Intel(R) Pentium(R) 4 CPU 2”, “PIV”)
CPU_MHZ                   CPU speed in MHz of the execution host
CPU_FREE                  Percentage of free CPU of the execution host
CPU_SMP                   CPU SMP size of the execution host
NODECOUNT                 Total number of nodes of the execution host
SIZE_MEM_MB               Total memory size in MB of the execution host
FREE_MEM_MB               Free memory in MB of the execution host
SIZE_DISK_MB              Total disk space in MB of the execution host
FREE_DISK_MB              Free disk space in MB of the execution host
LRMS_NAME                 Name of the local DRM system (job manager) for execution, usually not fork (e.g. “jobmanager-pbs”, “Pbs”, “jobmanager-sge”, “SGE”)
LRMS_TYPE                 Type of the local DRM system for execution (e.g. “PBS”, “SGE”)
QUEUE_NAME[i]             Name of queue i (e.g. “default”, “short”, “dteam”)
QUEUE_NODECOUNT[i]        Total node count of queue i
QUEUE_FREENODECOUNT[i]    Free node count of queue i
QUEUE_MAXTIME[i]          Maximum wall time of jobs in queue i
QUEUE_MAXCPUTIME[i]       Maximum CPU time of jobs in queue i
QUEUE_MAXCOUNT[i]         Maximum count of jobs that can be submitted in one request to queue i
QUEUE_MAXRUNNINGJOBS[i]   Maximum number of running jobs in queue i
QUEUE_MAXJOBSINQUEUE[i]   Maximum number of queued jobs in queue i
QUEUE_DISPATCHTYPE[i]     Dispatch type of queue i (e.g. “batch”, “immediate”)
QUEUE_PRIORITY[i]         Priority of queue i
QUEUE_STATUS[i]           Status of queue i (e.g. “active”, “production”)


The information drivers interface with the grid information services to collect resource attributes. End users can reference these attributes in the requirement and rank expressions of the job template to filter, prioritize and select candidate hosts.

GridWay can use as many information drivers simultaneously as needed. For example, it can use MDS2 and MDS4 services at the same time, and can therefore use resources from different Grids at once. The drivers for MDS2 and MDS4 provide the variables described in Table 2. The information manager can also receive other parameters from a driver. The GridWay team has used additional parameters that can matter for application efficiency (HTC applications) and for job migration: BANDWIDTH, LATENCY, SPEC_INT, SPEC_FLOAT…
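
As a rough illustration of how host attributes can drive filtering and prioritizing, the following Python sketch keeps the hosts that satisfy a requirement and orders them by a rank value. It does not use GridWay's job template expression syntax. The function name select_hosts, the host names and the attribute values are hypothetical, and only attribute names from Table 2 are used.

```python
def select_hosts(hosts, requirement, rank):
    """Keep hosts that satisfy `requirement`, ordered by descending `rank`."""
    candidates = [host for host in hosts if requirement(host)]
    return sorted(candidates, key=rank, reverse=True)


# Hypothetical attribute sets, shaped like what an Information MAD reports.
hosts = [
    {"HOSTNAME": "a.example.org", "ARCH": "i686", "CPU_MHZ": 2400, "FREE_MEM_MB": 2048},
    {"HOSTNAME": "b.example.org", "ARCH": "alpha", "CPU_MHZ": 3200, "FREE_MEM_MB": 4096},
    {"HOSTNAME": "c.example.org", "ARCH": "i686", "CPU_MHZ": 3000, "FREE_MEM_MB": 1024},
]

# Requirement: i686 hosts with at least 1024 MB free; rank: faster CPUs first.
selected = select_hosts(
    hosts,
    requirement=lambda h: h["ARCH"] == "i686" and h["FREE_MEM_MB"] >= 1024,
    rank=lambda h: h["CPU_MHZ"],
)
selected_names = [h["HOSTNAME"] for h in selected]
```

With these sample values, the alpha host is filtered out and the remaining two hosts are ordered by CPU speed.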

Using them by hand

You can start a MAD by hand. The following example runs the LDAP MAD against the server gilda-bdii.ct.infn.it, restricted to the Production hosts of the gilda Virtual Organization. Here $ is the shell prompt, < marks a line you type, and > marks the answer from the MAD.

$ gw_im_mad_egee_ldap -s gilda-bdii.ct.infn.it -q "(GlueCEStateStatus=Production)(GlueCEAccessControlBaseRule=VO:gilda)"
< INIT - - -
> INIT - SUCCESS -
< DISCOVER - - -
> DISCOVER - SUCCESS ce1-egee.srce.hr gilda-01.pd.infn.it dgt01.ui.savba.sk vega-ce.ct.infn.it grid010.ct.infn.it ce.hpc.iit.bme.hu iceage-ce-01.ct.infn.it ce-edu.grid.acad.bg sirius-ce.ct.infn.it dc01.nesc.ed.ac.uk gn0.hpcc.sztaki.hu
< MONITOR - gn0.hpcc.sztaki.hu -
> MONITOR - SUCCESS HOSTNAME="gn0.hpcc.sztaki.hu" ARCH="i686" NODECOUNT=16 LRMS_NAME="jobmanager-lcgpbs" LRMS_TYPE="torque" OS_NAME="ScientificSL" OS_VERSION="Beryllium" CPU_MODEL="PentiumD" CPU_MHZ=3000 CPU_SMP=2 FREE_MEM_MB=1024 SIZE_MEM_MB=1024 QUEUE_NAME[0]="gilda" QUEUE_NODECOUNT[0]=16 QUEUE_FREENODECOUNT[0]=16 QUEUE_MAXTIME[0]=4320 QUEUE_MAXCPUTIME[0]=2880 QUEUE_MAXJOBSINQUEUE[0]=999999999 QUEUE_MAXRUNNINGJOBS[0]=999999999 QUEUE_STATUS[0]="Production" QUEUE_DISPATCHTYPE[0]="batch" QUEUE_PRIORITY[0]="1" QUEUE_JOBWAIT[0]="0" QUEUE_ACCESS[0]=":gilda:" 
< FINALIZE - - -
> FINALIZE - SUCCESS -
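
The response lines in the session above can be taken apart with a few lines of Python. This is a minimal sketch, assuming that the four fields are separated by single spaces and that INFO for MONITOR is a list of NAME=value pairs with optional double quotes, as in the transcript. The function names parse_im_response and parse_attributes are illustrative and not part of GridWay.

```python
import shlex


def parse_im_response(line):
    """Split an 'OPERATION HID RESULT INFO' line into its four fields."""
    operation, hid, result, info = line.strip().split(" ", 3)
    return operation, hid, result, info


def parse_attributes(info):
    """Turn the NAME=value pairs of a MONITOR response into a dict."""
    attributes = {}
    for token in shlex.split(info):  # shlex removes the double quotes
        name, _, value = token.partition("=")
        # Values made only of digits become int; everything else stays a string.
        attributes[name] = int(value) if value.isdigit() else value
    return attributes


# Shortened copy of the MONITOR answer shown in the session above.
line = ('MONITOR - SUCCESS HOSTNAME="gn0.hpcc.sztaki.hu" ARCH="i686" '
        'NODECOUNT=16 QUEUE_NAME[0]="gilda" QUEUE_MAXTIME[0]=4320 ')
operation, hid, result, info = parse_im_response(line)
attributes = parse_attributes(info)
```

The same parse_im_response call also handles the DISCOVER answer, where INFO is a space-separated list of host names that can be split with info.split().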

Execution manager MAD

To abstract the resource management middleware layer, GridWay uses a Middleware Access Driver (MAD) module to submit, control and monitor the execution of jobs. This module provides the basic operations on the resource management middleware.

The format to send a request to the Execution MAD, through its standard input, is:

OPERATION JID HOST/JM RSL

Where:

On the other side, the format to receive a response from the MAD, through its standard output, is:

OPERATION JID RESULT INFO

Where:

Using them by hand

In this example, a globus-gass-server has been started in another terminal on host ui-egee.dacya.ucm.es, port 34069, and /tmp/sleep.rsl is a test file containing the RSL description of the executable:

&(executable="/bin/sleep")(arguments="50")(stdout="https://ui-egee.dacya.ucm.es:34069//tmp/sleep.out")(stderr="https://ui-egee.dacya.ucm.es:34069//tmp/sleep.err")(environment=(GW_HOSTNAME "gilda-ce.rediris.es")(GW_USER "gwuser")(GW_JOB_ID 1)(GW_TASK_ID 0)(GW_ARRAY_ID -1)(GW_TOTAL_TASKS 0)(GW_RESTARTED 0))(queue="gilda")

We also have valid credentials for submitting to gilda-ce.rediris.es/jobmanager-lcgpbs:

$ gw_em_mad_prews
< INIT - - -
> INIT - SUCCESS -
< SUBMIT 1 gilda-ce.rediris.es/jobmanager-lcgpbs /tmp/job.rsl
> SUBMIT 1 SUCCESS https://gilda-ce.rediris.es:20008/22945/1266605386/
 (some time after)
< POLL 1  - -
> POLL 1 SUCCESS PENDING
< POLL 1  - -
> POLL 1 SUCCESS ACTIVE
 (50 seconds later)
> TIMER - SUCCESS Credential is valid until Fri Oct 16 00:15:36 2009
< FINALIZE - - -
> FINALIZE - SUCCESS -
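
To make the line protocol concrete, here is a minimal Python sketch of a toy driver that answers the four operations seen in the session above. It submits nothing to any middleware: every job simply stays in the PENDING state. The FAILURE result value, the error messages and the dummy:// contact string are assumptions made for illustration, as are the names handle_request and serve.

```python
import sys


def handle_request(line, jobs):
    """Return the 'OPERATION JID RESULT INFO' answer for one request line."""
    fields = line.split()  # tolerates repeated spaces between fields
    if len(fields) != 4:
        return "- - FAILURE malformed request"  # assumed error format
    operation, jid, contact, rsl = fields
    if operation == "INIT":
        return "INIT - SUCCESS -"
    if operation == "SUBMIT":
        jobs[jid] = "PENDING"  # toy state, nothing is really submitted
        return f"SUBMIT {jid} SUCCESS dummy://{contact}/{jid}"
    if operation == "POLL":
        if jid not in jobs:
            return f"POLL {jid} FAILURE unknown job"
        return f"POLL {jid} SUCCESS {jobs[jid]}"
    if operation == "FINALIZE":
        return "FINALIZE - SUCCESS -"
    return f"{operation} {jid} FAILURE unsupported operation"


def serve(requests, responses):
    """Answer requests line by line until FINALIZE or end of input."""
    jobs = {}
    for line in requests:
        if not line.strip():
            continue
        answer = handle_request(line, jobs)
        print(answer, file=responses, flush=True)
        if answer.startswith("FINALIZE"):
            break
```

Calling serve(sys.stdin, sys.stdout) from a script turns the sketch into an interactive driver that can be exercised by hand in the same way as the session above.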

Transfer manager MAD

To abstract the file transfer management middleware layer, GridWay uses a Middleware Access Driver (MAD) module to transfer job files. This module provides the basic operations on the file transfer middleware.

The format to send a request to the Transfer MAD, through its standard input, is:

OPERATION JID TID EXE_MODE SRC_URL DST_URL

Where:

On the other side, the format to receive a response from the MAD, through its standard output, is:

OPERATION JID TID RESULT INFO

Where: