How does KOALA work?

Phases

KOALA performs five operations on every submitted job: placing its components, transferring its input files and executable, claiming processors for its components, launching it and monitoring its execution, and transferring its output files back to the user. These operations form the four phases that a job goes through, shown in the figure above. In phase 1, KOALA tries to place the components of the job on the system using one of its placement policies: Close-to-Files (CF), Worst Fit (WF), Flexible Cluster Minimization (FCM), Cluster Minimization (CM), and FCM with optimized wide-area communication. More about these policies can be read here, here and here.
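
As an illustration of what a placement policy does in phase 1, here is a minimal sketch of a worst-fit placement. It assumes the conventional meaning of Worst Fit, namely that each component goes to the cluster that currently has the most idle processors; the actual KOALA implementation may differ in its details. The function name, the cluster names, and the processor counts are hypothetical.

```python
from typing import Dict, List, Optional


def worst_fit(component_sizes: List[int],
              idle_processors: Dict[str, int]) -> Optional[Dict[int, str]]:
    """Sketch of a worst-fit placement (an assumption, not KOALA's code).

    Returns a mapping from component index to cluster name, or None when
    some component does not fit, in which case the placement try fails.
    """
    if not component_sizes:
        return {}
    if not idle_processors:
        return None
    idle = dict(idle_processors)  # work on a copy so a failed try changes nothing
    placement: Dict[int, str] = {}
    # Handle the largest components first, since they are the hardest to fit.
    order = sorted(range(len(component_sizes)),
                   key=lambda i: component_sizes[i], reverse=True)
    for i in order:
        # Pick the cluster with the most idle processors; ties go to the
        # alphabetically first name so that the result is deterministic.
        cluster = max(sorted(idle), key=lambda name: idle[name])
        if idle[cluster] < component_sizes[i]:
            return None
        idle[cluster] -= component_sizes[i]
        placement[i] = cluster
    return placement
```

With the hypothetical input `worst_fit([8, 4], {"A": 10, "B": 6})`, the first component is placed on cluster A, which leaves B as the emptiest cluster for the second component.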

Phase 2 consists of starting and managing the file transfers for a job that was placed successfully in phase 1. File transfers are carried out by either Globus GridFTP or Secure FTP (SFTP), depending on the type of runner used.

In phase 3, while the job is in the claiming queue, attempts are made at designated times to claim the reserved processors for its components. The job is in phase 4 once claiming has succeeded in phase 3 and all of its components have been launched on their respective execution sites. After the job has finished executing, the output files it produced are sent back to the site used to launch the job (the submission site).
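
The four phases can be pictured as a small state machine. The sketch below is only an illustration: the `Phase` names, the extra `DONE` marker, and the rule that a job stays in its phase until the phase succeeds are assumptions made for the example, not KOALA's actual code.

```python
from enum import Enum


class Phase(Enum):
    PLACING = 1        # phase 1: place the components with a placement policy
    TRANSFERRING = 2   # phase 2: start and manage the input file transfers
    CLAIMING = 3       # phase 3: claim the reserved processors
    RUNNING = 4        # phase 4: all components launched and monitored
    DONE = 5           # hypothetical end marker, not a KOALA phase


def next_phase(phase: Phase, succeeded: bool) -> Phase:
    """Advance on success; otherwise stay in the same phase to retry."""
    if phase is Phase.DONE or not succeeded:
        return phase
    return Phase(phase.value + 1)


# Example: the first placement try fails, everything afterwards succeeds.
phase = Phase.PLACING
for outcome in [False, True, True, True, True]:
    phase = next_phase(phase, outcome)
```

After the loop, `phase` is `Phase.DONE`: the failed first try keeps the job in the placing phase, and each later success moves it one phase forward.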

What application types does KOALA support?

In general, KOALA should support all application types. This is accomplished by adding a runner that implements the requirements specific to the application type in question. For instance, for applications that use co-allocation, a major issue is the type of communication library they use. KOALA has the OMRunner for parallel Open MPI (OMPI) jobs, which communicate over either Myrinet or Gigabit Ethernet; the faster interconnect that is available in all the clusters selected for the job is preferred. KOALA also has the DRunner for jobs that use MPICH-G2, for which it relies on the DUROC component of the Globus Toolkit. For Ibis jobs, which use a special communication library written in Java, KOALA has the IRunner. Ibis has been developed at the Vrije Universiteit in Amsterdam. With the new release, KOALA also supports running parameter sweep applications (PSAs) with the CSRunner. Support for a new application type can thus be added by writing a runner for it. More about the runners can be found here.
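
The idea of one runner per application type can be sketched as a simple lookup table. The runner names below come from the text; the application-type keys and the functions `select_runner` and `register_runner` are hypothetical and only illustrate the extension mechanism, not KOALA's real interface.

```python
from typing import Dict

# Runner names are taken from the text; the keys are hypothetical labels.
RUNNERS: Dict[str, str] = {
    "openmpi": "OMRunner",
    "mpich-g2": "DRunner",
    "ibis": "IRunner",
    "psa": "CSRunner",
}


def select_runner(app_type: str) -> str:
    """Return the name of the runner registered for an application type."""
    try:
        return RUNNERS[app_type.lower()]
    except KeyError:
        raise ValueError(f"no runner registered for {app_type!r}") from None


def register_runner(app_type: str, runner_name: str) -> None:
    """Add support for a new application type by registering its runner."""
    RUNNERS[app_type.lower()] = runner_name
```

In this sketch, supporting a new application type is a single `register_runner` call, which mirrors the point that only a new runner has to be written.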

What are the KOALA job requests?

Job Request

A job consists of one or more job components, which collectively perform a useful task for a user. The job components contain the information needed to schedule and execute an application across the grid, such as the numbers and speeds of their processors, the sizes and locations of their input files, their memory requirements, and the required runtime libraries. A job request contains a detailed description of these job components. A job request may or may not specify its execution sites and the numbers and sizes (in terms of the number of processors) of its job components. Based on this, KOALA supports four cases for the structure of job requests, described below and shown in the figure above.

  • Fixed request: The job request specifies the numbers of processors it needs in all clusters from which processors must be allocated for its components.
  • Non-fixed request: The job request only specifies the numbers of processors required by its components, allowing the KOALA scheduler to choose the execution sites.
  • Semi-fixed request: The job request is a combination of a fixed and a non-fixed request.
  • Flexible request: The job request only specifies the total number of processors it requires. It is left to the scheduler to split up the job and to decide on the number of components, the number of processors for each component, and the execution sites for the components.
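
The four request structures above can be told apart by what the request leaves unspecified. The sketch below assumes that a semi-fixed request names an execution site for some components but not for others; the `Component` and `JobRequest` classes and the `classify` function are hypothetical and do not reflect KOALA's actual job description format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    processors: int
    site: Optional[str] = None  # None means the scheduler chooses the site


@dataclass
class JobRequest:
    components: List[Component] = field(default_factory=list)
    total_processors: Optional[int] = None  # only used by flexible requests


def classify(request: JobRequest) -> str:
    """Name the request structure based on what the request specifies."""
    if not request.components:
        if request.total_processors is None:
            raise ValueError("request specifies neither components nor a total")
        return "flexible"
    has_site = [c.site is not None for c in request.components]
    if all(has_site):
        return "fixed"
    if not any(has_site):
        return "non-fixed"
    return "semi-fixed"
```

For example, `classify(JobRequest(total_processors=64))` yields `"flexible"`, because the scheduler is left to decide the components, their sizes, and their sites.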

What are the job priorities in KOALA?

KOALA assigns one of four priority levels to each job: super-high, high, low, and super-low. We have limited the number of levels to four based on the following types of jobs, which are common in grids:

  • Interactive jobs. These are jobs that run interactively and require quick responses. To avoid delaying them, the super-high priority level is assigned to them.
  • Occasional jobs. These are batch jobs that are submitted for special occasions, such as demos or tight deadlines. The high priority level is assigned to these jobs.
  • Batch jobs. These are normal batch jobs that run without any special requirements. They are assigned the low priority level.
  • Cycle-scavenging jobs. These jobs scavenge machines for available CPU cycles and are assigned the super-low priority level.

The four priority levels can also be assigned to jobs based on a system policy. For instance, on the DAS, KOALA assigns priorities based on the estimated job runtimes, with longer jobs having lower priorities.
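
Both ways of assigning priorities can be sketched in a few lines. The job-type mapping follows the list above; the numeric values of the levels and the runtime thresholds are hypothetical, since the text does not state the values used on the DAS.

```python
from enum import IntEnum


class Priority(IntEnum):
    # The numeric values are an assumption; only the ordering matters here.
    SUPER_LOW = 0
    LOW = 1
    HIGH = 2
    SUPER_HIGH = 3


# Assignment by job type, as described in the list above.
JOB_TYPE_PRIORITY = {
    "interactive": Priority.SUPER_HIGH,
    "occasional": Priority.HIGH,
    "batch": Priority.LOW,
    "cycle-scavenging": Priority.SUPER_LOW,
}

# Hypothetical thresholds in minutes; longer jobs get lower priorities.
RUNTIME_THRESHOLDS_MIN = [
    (15, Priority.SUPER_HIGH),
    (60, Priority.HIGH),
    (600, Priority.LOW),
]


def priority_from_runtime(estimated_minutes: float) -> Priority:
    """Map an estimated runtime to a priority level (policy-based sketch)."""
    for limit, priority in RUNTIME_THRESHOLDS_MIN:
        if estimated_minutes <= limit:
            return priority
    return Priority.SUPER_LOW
```

Using an ordered enumeration keeps comparisons such as `Priority.HIGH > Priority.LOW` meaningful, which is convenient when sorting a queue of jobs by priority.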

Why does my runner hang?

To understand what is really going on, first use the "-l DEBUG" flag to enable debug logs. If you still cannot see the problem, there are usually two possible causes. First, check your .bashrc file and make sure that the required modules are loaded. You can start with the minimal .bashrc file in the Using Koala section. Second, make sure that you can make passwordless SSH connections to all the headnodes. Configuring passwordless SSH is also described in the Using Koala section.
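
The passwordless SSH check can be automated with a short script. The sketch below relies on the standard OpenSSH options `BatchMode=yes`, which makes ssh fail instead of prompting for a password, and `ConnectTimeout`; the function names and any headnode names you pass in are hypothetical and must be replaced with the hosts of your own system.

```python
import subprocess
from typing import List


def ssh_check_command(host: str, timeout_s: int = 5) -> List[str]:
    """Build an ssh command that fails instead of asking for a password."""
    return ["ssh", "-o", "BatchMode=yes",
            "-o", f"ConnectTimeout={timeout_s}", host, "true"]


def can_login_without_password(host: str, timeout_s: int = 5) -> bool:
    """Return True if a non-interactive SSH login to the host succeeds."""
    try:
        result = subprocess.run(ssh_check_command(host, timeout_s),
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL,
                                timeout=timeout_s + 5)
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed, or the connection did not finish in time.
        return False
    return result.returncode == 0
```

Calling `can_login_without_password` for each headnode and listing the hosts for which it returns `False` shows where the SSH configuration still needs attention.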

What is the status of KOALA?

KOALA has been operational on the DAS-2 testbed since September 2005 and on the DAS-3 since May 2007, and has been used successfully in different projects on the DAS. Because of its modular structure and its ease of use, KOALA has become a common tool in new research. Work is currently ongoing to add workflow execution support to KOALA.

KOALA News

  • January 2013: MR-Runner upgraded! Now the MR-Runner deploys Hadoop-1.0.0 clusters, compatible with Pig-0.10.0. 

  • December 2012: KOALA 2.1 released! Deploy MapReduce clusters on DAS-4 with the KOALA MR-Runner.

  • November 2012: Best Paper Award at the MTAGS12 workshop (co-located with SC12) with work on MapReduce!

  • November 2009: KOALA 2.0 released! You can now run parameter sweep applications (PSAs) with the KOALA CSRunner.

  • April 2008: New KOALA runner! The OMRunner enables DRMAA and OpenMPI job submissions. 

  • July 2007: Paper accepted at the Grid07 conference with work on scheduling malleable jobs in KOALA.

  • May 2007: KOALA has now been ported successfully to DAS-3. All the KOALA runners are operational except the DRunner.

  • April 2007: The KOALA IRunner has been updated to include recommendations made by the Ibis group.