====== Middleware Access Drivers 5.10 ======

===== Information manager MAD =====

To abstract the monitoring and discovery middleware layer (the Grid Information System), GridWay uses a Middleware Access Driver (MAD) module to discover and monitor hosts. This module provides the basic operations on the monitoring and discovery middleware.

Requests are sent to the Information MAD through its standard input, in the format:

  OPERATION HID HOST ARGS

Where:

  * OPERATION: Can be one of the following:
    * INIT: Initializes the MAD (i.e. ''INIT - - -'').
    * DISCOVER: Discovers hosts (i.e. ''DISCOVER - - -'').
    * MONITOR: Monitors a host (i.e. ''MONITOR HID HOST -'').
    * FINALIZE: Finalizes the MAD (i.e. ''FINALIZE - - -'').
  * HID: If the operation is MONITOR, it is a host identifier chosen by GridWay. Otherwise it is ignored.
  * HOST: If the operation is MONITOR, it specifies the host to monitor. Otherwise it is ignored.

Responses are received from the MAD through its standard output, in the format:

  OPERATION HID RESULT INFO

Where:

  * OPERATION: The operation specified in the request that originated the response.
  * HID: The host identifier, as provided in the request.
  * RESULT: The result of the operation, either SUCCESS or FAILURE.
  * INFO: If RESULT is FAILURE, it contains the cause of the failure. Otherwise, if OPERATION is DISCOVER, it contains a list of discovered hosts, and if OPERATION is MONITOR, it contains a list of host attributes.

**Table 2. Attributes that should be defined by the Information MADs.**

^ Attribute ^ Description ^
| HOSTNAME | FQDN (Fully Qualified Domain Name) of the execution host (e.g. "hydrus.dacya.ucm.es") |
| ARCH | Architecture of the execution host (e.g. "i686", "alpha") |
| OS_NAME | Operating System name of the execution host (e.g. "Linux", "SL") |
| OS_VERSION | Operating System version of the execution host (e.g. "2.6.9-1.66", "3") |
| CPU_MODEL | CPU model of the execution host (e.g. "Intel(R) Pentium(R) 4 CPU 2", "PIV") |
| CPU_MHZ | CPU speed in MHz of the execution host |
| CPU_FREE | Percentage of free CPU of the execution host |
| CPU_SMP | CPU SMP size of the execution host |
| NODECOUNT | Total number of nodes of the execution host |
| SIZE_MEM_MB | Total memory size in MB of the execution host |
| FREE_MEM_MB | Free memory in MB of the execution host |
| SIZE_DISK_MB | Total disk space in MB of the execution host |
| FREE_DISK_MB | Free disk space in MB of the execution host |
| LRMS_NAME | Name of the local DRM system (job manager) for execution, usually not fork (e.g. "jobmanager-pbs", "Pbs", "jobmanager-sge", "SGE") |
| LRMS_TYPE | Type of the local DRM system for execution (e.g. "PBS", "SGE") |
| QUEUE_NAME[i] | Name of queue i (e.g. "default", "short", "dteam") |
| QUEUE_NODECOUNT[i] | Total node count of queue i |
| QUEUE_FREENODECOUNT[i] | Free node count of queue i |
| QUEUE_MAXTIME[i] | Maximum wall time of jobs in queue i |
| QUEUE_MAXCPUTIME[i] | Maximum CPU time of jobs in queue i |
| QUEUE_MAXCOUNT[i] | Maximum count of jobs that can be submitted in one request to queue i |
| QUEUE_MAXRUNNINGJOBS[i] | Maximum number of running jobs in queue i |
| QUEUE_MAXJOBSINQUEUE[i] | Maximum number of queued jobs in queue i |
| QUEUE_DISPATCHTYPE[i] | Dispatch type of queue i (e.g. "batch", "immediate") |
| QUEUE_PRIORITY[i] | Priority of queue i |
| QUEUE_STATUS[i] | Status of queue i (e.g. "active", "production") |

The Information drivers interface with the Grid information services to collect resource attributes. End users can use these attributes in the requirement and rank expressions of the job template to filter, prioritize and select candidate hosts. GridWay can use as many Information drivers simultaneously as needed. For example, it can use MDS2 and MDS4 services at the same time, and therefore resources from different Grids.
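The request and response formats of the Information MAD are simple enough to handle with a few string operations. The following Python sketch is only an illustration, not part of GridWay: the function names are hypothetical, and it assumes that MONITOR attributes arrive as space-separated ''NAME=value'' pairs with string values in double quotes, as in the sample session shown later in this section. Values are kept as strings.

```python
import re


def format_im_request(operation, hid="-", host="-", args="-"):
    """Build one request line: OPERATION HID HOST ARGS (unused fields are '-')."""
    return f"{operation} {hid} {host} {args}"


def parse_im_response(line):
    """Split a response line into OPERATION, HID, RESULT and INFO.

    INFO may contain spaces, so only the first three separators are split on.
    """
    fields = line.rstrip("\n").split(" ", 3)
    fields += ["-"] * (4 - len(fields))  # tolerate a missing INFO field
    keys = ("operation", "hid", "result", "info")
    return dict(zip(keys, fields))


# Assumed attribute syntax: NAME="quoted value" or NAME[i]=bare_value
_ATTRIBUTE = re.compile(r'(\w+(?:\[\d+\])?)=("[^"]*"|\S+)')


def parse_attributes(info):
    """Turn the INFO field of a MONITOR response into a dict of strings."""
    return {name: value.strip('"') for name, value in _ATTRIBUTE.findall(info)}
```

A caller would write the result of ''format_im_request'' to the standard input of the MAD, read one line from its standard output, and pass that line to ''parse_im_response''. ''parse_attributes'' should only be applied when the operation is MONITOR and the result is SUCCESS.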
Drivers for MDS2 and MDS4 provide the attributes described in Table 2. However, the information manager can also receive other attributes from a driver. The GridWay team has used additional attributes that can be important for improving application efficiency (HTC applications) and for job migration: BANDWIDTH, LATENCY, SPEC_INT, SPEC_FLOAT...

==== Using them by hand ====

You can start a MAD by hand. The following example runs the LDAP MAD against the server ''gilda-bdii.ct.infn.it'', restricted to the ''Production'' hosts of the ''gilda'' Virtual Organization. ''$'' is the shell prompt, ''<'' marks a line you type in and ''>'' marks an answer from the MAD.

  $ gw_im_mad_bdii -s gilda-bdii.ct.infn.it -q "(GlueCEStateStatus=Production)(GlueCEAccessControlBaseRule=VO:gilda)"
  < INIT - - -
  > INIT - SUCCESS -
  < DISCOVER - - -
  > DISCOVER - SUCCESS ce1-egee.srce.hr gilda-01.pd.infn.it dgt01.ui.savba.sk vega-ce.ct.infn.it grid010.ct.infn.it ce.hpc.iit.bme.hu iceage-ce-01.ct.infn.it ce-edu.grid.acad.bg sirius-ce.ct.infn.it dc01.nesc.ed.ac.uk gn0.hpcc.sztaki.hu
  < MONITOR - gn0.hpcc.sztaki.hu -
  > MONITOR - SUCCESS HOSTNAME="gn0.hpcc.sztaki.hu" ARCH="i686" NODECOUNT=16 LRMS_NAME="jobmanager-lcgpbs" LRMS_TYPE="torque" OS_NAME="ScientificSL" OS_VERSION="Beryllium" CPU_MODEL="PentiumD" CPU_MHZ=3000 CPU_SMP=2 FREE_MEM_MB=1024 SIZE_MEM_MB=1024 QUEUE_NAME[0]="gilda" QUEUE_NODECOUNT[0]=16 QUEUE_FREENODECOUNT[0]=16 QUEUE_MAXTIME[0]=4320 QUEUE_MAXCPUTIME[0]=2880 QUEUE_MAXJOBSINQUEUE[0]=999999999 QUEUE_MAXRUNNINGJOBS[0]=999999999 QUEUE_STATUS[0]="Production" QUEUE_DISPATCHTYPE[0]="batch" QUEUE_PRIORITY[0]="1" QUEUE_JOBWAIT[0]="0" QUEUE_ACCESS[0]=":gilda:"
  < FINALIZE - - -
  > FINALIZE - SUCCESS -

===== Execution manager MAD =====

To abstract the resource management middleware layer, GridWay uses a Middleware Access Driver (MAD) module to submit, control and monitor the execution of jobs. This module provides the basic operations on the resource management middleware.
Requests are sent to the Execution MAD through its standard input, in the format:

  OPERATION JID HOST/JM RSL

Where:

  * OPERATION: Can be one of the following:
    * INIT: Initializes the MAD.
    * SUBMIT: Submits a job.
    * POLL: Polls a job to obtain its state.
    * CANCEL: Cancels a job.
    * FINALIZE: Finalizes the MAD.
  * JID: A job identifier, chosen by GridWay.
  * HOST: If the operation is SUBMIT, it specifies the resource contact to submit the job to. Otherwise it is ignored.
  * JM: If the operation is SUBMIT, it specifies the job manager to submit the job to. Otherwise it is ignored.
  * RSL: If the operation is SUBMIT, it specifies the path to the file with the job description. Otherwise it is ignored.

Responses are received from the MAD through its standard output, in the format:

  OPERATION JID RESULT INFO

Where:

  * OPERATION: The operation specified in the request that originated the response, or CALLBACK in the case of an asynchronous notification of a state change.
  * JID: The job identifier, as provided in the submission request.
  * RESULT: The result of the operation, either SUCCESS or FAILURE.
  * INFO: If RESULT is FAILURE, it contains the cause of the failure. Otherwise, if OPERATION is POLL or CALLBACK, it contains the state of the job.

==== Using them by hand ====

In this example, a globus-gass-server was started in another terminal on host glite-ui.dacya.ucm.es, port 34069, and ''/tmp/sleep.rsl'' is a test file containing the RSL description of the executable:

  &(executable="/bin/sleep")(arguments="50")(stdout="https://glite-ui.dacya.ucm.es:34069//tmp/sleep.out")(stderr="https://glite-ui.dacya.ucm.es:34069//tmp/sleep.err")(environment=(GW_HOSTNAME "gilda-ce.rediris.es")(GW_USER "gwuser")(GW_JOB_ID 1)(GW_TASK_ID 0)(GW_ARRAY_ID -1)(GW_TOTAL_TASKS 0)(GW_RESTARTED 0))(queue="gilda")

We also have valid credentials for submitting to ''lcgce0.shef.ac.uk/jobmanager-lcgpbs'':

  $ gw_em_mad_gram2
  < INIT - - -
  > INIT - SUCCESS -
  < SUBMIT 1 gilda-ce.rediris.es/jobmanager-lcgpbs /tmp/job.rsl
  > SUBMIT 1 SUCCESS https://gilda-ce.rediris.es:20008/22945/1266605386/
  (some time after)
  < POLL 1 - -
  > POLL 1 SUCCESS PENDING
  < POLL 1 - -
  > POLL 1 SUCCESS ACTIVE
  (50 seconds later)
  > TIMER - SUCCESS Credential is valid until Fri Oct 16 00:15:36 2009
  < FINALIZE - - -
  > FINALIZE - SUCCESS -

===== Transfer manager MAD =====

To abstract the file transfer management middleware layer, GridWay uses a Middleware Access Driver (MAD) module to transfer job files. This module provides the basic operations on the file transfer middleware.

Requests are sent to the Transfer MAD through its standard input, in the format:

  OPERATION JID TID EXE_MODE SRC_URL DST_URL

Where:

  * OPERATION: Can be one of the following:
    * INIT: Initializes the MAD. JID should be the maximum number of jobs.
    * START: Starts the transfers associated with job JID.
    * END: Finishes the transfers associated with job JID.
    * MKDIR: Creates the directory SRC_URL.
    * RMDIR: Removes the directory SRC_URL.
    * CP: Starts a copy of SRC_URL to DST_URL, with identifier TID, associated with job JID.
    * FINALIZE: Finalizes the MAD.
  * JID: A job identifier, chosen by GridWay.
  * TID: A transfer identifier, only relevant for the CP operation.
  * EXE_MODE: If equal to 'X', the file is given execution permissions. Only relevant for the CP operation.

Responses are received from the MAD through its standard output, in the format:

  OPERATION JID TID RESULT INFO

Where:

  * OPERATION: The operation specified in the request that originated the response, or CALLBACK in the case of an asynchronous notification of a state change.
  * JID: The job identifier, as provided in the START request.
  * TID: The transfer identifier, as provided in the CP request.
  * RESULT: The result of the operation, either SUCCESS or FAILURE.
  * INFO: If RESULT is FAILURE, it contains the cause of the failure.
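
The Transfer MAD protocol described above can be exercised with a small stand-in driver. The following Python sketch is only an illustration under stated assumptions, not a GridWay component: the function names are hypothetical, unused request fields are assumed to be written as ''-'' (by analogy with the other MADs), and the toy loop only acknowledges INIT and FINALIZE while rejecting every other operation.

```python
import io


def format_tm_request(operation, jid="-", tid="-", exe_mode="-",
                      src_url="-", dst_url="-"):
    """Build one request line: OPERATION JID TID EXE_MODE SRC_URL DST_URL."""
    fields = (operation, jid, tid, exe_mode, src_url, dst_url)
    return " ".join(str(field) for field in fields)


def parse_tm_response(line):
    """Split a response line into OPERATION, JID, TID, RESULT and INFO."""
    fields = line.rstrip("\n").split(" ", 4)  # INFO may contain spaces
    fields += ["-"] * (5 - len(fields))
    keys = ("operation", "jid", "tid", "result", "info")
    return dict(zip(keys, fields))


def serve(requests, responses):
    """Toy driver loop: one response line per request line.

    INIT and FINALIZE succeed; any other operation fails because this
    sketch performs no real transfers. The loop stops after FINALIZE.
    """
    for line in requests:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        fields += ["-"] * (6 - len(fields))
        operation, jid, tid = fields[0], fields[1], fields[2]
        if operation in ("INIT", "FINALIZE"):
            responses.write(f"{operation} {jid} {tid} SUCCESS -\n")
        else:
            responses.write(
                f"{operation} {jid} {tid} FAILURE "
                "operation not implemented in this sketch\n"
            )
        responses.flush()  # the peer reads responses line by line
        if operation == "FINALIZE":
            break
```

Calling ''serve(sys.stdin, sys.stdout)'' (after ''import sys'') turns the sketch into a program that can be driven by hand in the same way as the real MADs. For testing, ''io.StringIO'' objects can stand in for the two streams.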