Geeks With Blogs

Josh Reuben

HPC Job Types

HPC has 3 types of jobs: http://technet.microsoft.com/en-us/library/cc972750(v=ws.10).aspx

· Task Flow – vanilla sequence

· Parametric Sweep – concurrently run multiple instances of the same program, each with a different work unit input

· MPI – message passing between master & slave tasks

But when you try to go outside the box (job tasks that spawn jobs and block the parent task until they finish), you run the risk of resource starvation, deadlocks, and recursive, non-converging or exponential blow-up.
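
To make the deadlock risk concrete, here is a minimal Python sketch of a toy scheduler. It is not the HPC scheduler: the function name, the FIFO queue and the rule that a parent holds its core while waiting on its children are all assumptions made for illustration.

```python
from collections import deque


def run_nested_jobs(cores: int, depth: int, children: int) -> bool:
    """Simulate a toy FIFO scheduler. True if every task finishes, False on deadlock."""
    # Each task tracks how many children it still waits on, its parent,
    # and how many more levels of spawning it will do.
    tasks = [{"pending": 0, "parent": None, "depth": depth}]
    queue = deque([0])  # ids of tasks waiting for a core
    free = cores
    while queue:
        if free == 0:
            # Leaves finish instantly in this model, so every busy core is
            # held by a blocked parent and the queued tasks can never start.
            return False
        tid = queue.popleft()
        free -= 1
        task = tasks[tid]
        if task["depth"] > 0 and children > 0:
            # Spawn the children, then block while still holding the core.
            task["pending"] = children
            for _ in range(children):
                tasks.append({"pending": 0, "parent": tid, "depth": task["depth"] - 1})
                queue.append(len(tasks) - 1)
            continue
        # Leaf task: finishes at once, frees its core and notifies ancestors.
        free += 1
        parent = task["parent"]
        while parent is not None:
            tasks[parent]["pending"] -= 1
            if tasks[parent]["pending"] > 0:
                break
            free += 1  # the parent's last child finished, so it finishes too
            parent = tasks[parent]["parent"]
    return True
```

In this toy model, 2 children per task and 2 levels of spawning deadlock on 3 cores but complete on 4, because every blocked parent pins a core that its own descendants need.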

The solution to this is to write some performance monitoring and job scheduling code. You can do this in 2 ways:

  1. Manually - control scheduling yourself: allocate / de-allocate resources, change job priorities, pause & resume tasks, restrict long-running tasks to specific compute clusters
  2. Semi-automatically - set threshold params for scheduling
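
The semi-automatic option could look something like the Python sketch below. The threshold names, their default values and the three action strings are hypothetical; pick numbers that suit your own cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # All values are illustrative assumptions, not HPC defaults.
    high_util: float = 0.90  # at or above this the cluster counts as saturated
    low_util: float = 0.50   # at or below this there is spare capacity
    max_queued: int = 10     # backlog size that triggers throttling


def decide(utilization: float, queued_tasks: int, t: Thresholds = Thresholds()) -> str:
    """Map one utilization reading and the queue length to a scheduling action."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be a fraction between 0 and 1")
    if utilization >= t.high_util and queued_tasks > t.max_queued:
        return "throttle"  # saturated with a growing backlog: pause or de-prioritize spawned jobs
    if utilization <= t.low_util and queued_tasks > 0:
        return "grow"      # idle capacity and waiting work: allocate more resources
    return "hold"
```

For example, `decide(0.95, 20)` returns `"throttle"`, while `decide(0.3, 5)` returns `"grow"`.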

How – Control Job Scheduling

In order to manage the tasks and resources that are associated with a job, you will need to access the ISchedulerJob interface - http://msdn.microsoft.com/en-us/library/microsoft.hpc.scheduler.ischedulerjob_members(v=vs.85).aspx

This really allows you to control how a job is run – you can access & tweak the following features:

  • max / min resource values
  • whether job resources can grow / shrink, whether jobs can be pre-empted, and whether the job is exclusive per node
  • the creator process id & the job pool
  • timestamp of job creation & completion
  • job priority, hold time & run time limit
  • Re-queue count
  • Job progress
  • Max/ min Number of cores, nodes, sockets, RAM
  • Dynamic task list – can add / cancel tasks on the fly
  • Job counters
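
As a rough illustration of how the min / max and grow / shrink settings interact, here is a Python stand-in. `JobSettings` and `resize` are hypothetical names invented for this sketch; they are not members of ISchedulerJob, which is a .NET interface.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class JobSettings:
    # Hypothetical model of a few job properties, for illustration only.
    min_cores: int
    max_cores: int
    allocated: int
    can_grow: bool = True
    can_shrink: bool = True


def resize(job: JobSettings, requested: int) -> JobSettings:
    """Return a copy of the job with the allocation moved toward `requested`."""
    # Never leave the [min_cores, max_cores] range.
    target = max(job.min_cores, min(job.max_cores, requested))
    # Respect the grow / shrink flags by keeping the current allocation.
    if target > job.allocated and not job.can_grow:
        target = job.allocated
    if target < job.allocated and not job.can_shrink:
        target = job.allocated
    return replace(job, allocated=target)
```

Asking for 16 cores on a job limited to 8 yields 8, and a job that cannot grow simply keeps what it already has.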

When – poll perf counters

Tweaking the job scheduler should be done on the basis of resource utilization according to PerfMon counters – HPC exposes 2 Perf objects: Compute Clusters, Compute Nodes

http://technet.microsoft.com/en-us/library/cc720058(v=ws.10).aspx

You can monitor running jobs according to dynamic thresholds – use your own discretion:

  • Percentage processor time
  • Number of running jobs
  • Number of running tasks
  • Total number of processors
  • Number of processors in use
  • Number of processors idle
  • Number of serial tasks
  • Number of parallel tasks
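
A single counter reading is noisy, so one option is to smooth readings over a short window before comparing against a threshold. The Python sketch below assumes you already have some way of reading the "processors in use" and "total processors" values; the class name, window size and threshold are made up for illustration.

```python
from collections import deque


class UtilizationMonitor:
    """Moving average over the last few 'processors in use' samples."""

    def __init__(self, window: int = 5, threshold: float = 0.9):
        self.samples = deque(maxlen=window)  # old samples fall off automatically
        self.threshold = threshold

    def add_sample(self, processors_in_use: int, total_processors: int) -> float:
        """Record one reading and return the current moving average."""
        if total_processors <= 0:
            raise ValueError("total_processors must be positive")
        self.samples.append(processors_in_use / total_processors)
        return self.average()

    def average(self) -> float:
        return sum(self.samples) / len(self.samples)

    def saturated(self) -> bool:
        # Only report saturation once the window is full, to avoid reacting
        # to a single spike right after start-up.
        return len(self.samples) == self.samples.maxlen and self.average() >= self.threshold
```

Feeding it one sample per polling interval and acting only when `saturated()` is true keeps the scheduler from flapping on short bursts.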

Design Your algorithms correctly

Finally, don't assume you have unlimited compute resources in your cluster. Design your algorithms with the following factors in mind:

· Branching factor - http://en.wikipedia.org/wiki/Branching_factor - dynamically optimize the number of children per node
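
To see why the branching factor matters, count the tasks in a complete tree: with b children per node and d levels of spawning there are 1 + b + b^2 + ... + b^d of them. The Python sketch below uses that count to pick the largest branching factor that fits a task budget; both function names and the idea of a fixed budget are assumptions for illustration.

```python
def tree_size(branching: int, depth: int) -> int:
    """Number of nodes in a complete tree with the given branching factor and depth."""
    return sum(branching ** level for level in range(depth + 1))


def max_branching(task_budget: int, depth: int) -> int:
    """Largest branching factor whose complete tree stays within the task budget."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    b = 0
    while tree_size(b + 1, depth) <= task_budget:
        b += 1
    return b
```

With a budget of 100 tasks and 3 levels of spawning, 4 children per node fits (85 tasks) but 5 does not (156 tasks).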

· Cutoffs to prevent explosions - http://en.wikipedia.org/wiki/Limit_of_a_sequence - not all functions converge after n attempts, so cap the number of iterations. You also need a "good enough" threshold so that you stop once you hit diminishing returns

· Heuristic shortcuts - http://en.wikipedia.org/wiki/Heuristic - sometimes an exhaustive search is impractical and shortcuts are suitable

· Pruning - http://en.wikipedia.org/wiki/Pruning_(algorithm) - remove / de-prioritize unnecessary tree branches
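
Pruning can be sketched with a small branch-and-bound search in Python. The problem here (pick work units whose total cost comes as close as possible to a capacity without exceeding it) and the function name are chosen only for illustration; the `prune` flag lets you compare how many nodes get visited with and without the bound.

```python
def best_fit(costs, capacity, prune=True):
    """Return (best_total, nodes_visited) for the best subset within capacity."""
    costs = sorted(costs, reverse=True)
    # suffix[i] = total cost of items i..end, an optimistic bound on what is left.
    suffix = [0] * (len(costs) + 1)
    for i in range(len(costs) - 1, -1, -1):
        suffix[i] = suffix[i + 1] + costs[i]

    best = 0
    visited = 0

    def search(i, total):
        nonlocal best, visited
        visited += 1
        best = max(best, total)
        if i == len(costs):
            return
        # Prune: even taking every remaining item cannot beat the best so far.
        if prune and total + suffix[i] <= best:
            return
        if total + costs[i] <= capacity:
            search(i + 1, total + costs[i])  # branch that takes item i
        search(i + 1, total)                 # branch that skips item i

    search(0, 0)
    return best, visited
```

Both modes return the same best total, but the pruned search visits fewer nodes because whole subtrees are skipped as soon as they can no longer win.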

· Avoid local minima / maxima - http://en.wikipedia.org/wiki/Local_minima - sometimes an algorithm can't reach the global optimum because it gets stuck in a local optimum or saddle - try simulated annealing, random-restart hill climbing or genetic algorithms to get out of these ruts
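
Here is a small Python sketch of the idea on a made-up 1-D function with two dips. A plain greedy descent stalls in the nearest dip, while simulated annealing sometimes accepts worse moves early on and so has a chance to climb out. The test function, step sizes, starting temperature and cooling rate are all arbitrary assumptions.

```python
import math
import random


def f(x: float) -> float:
    # Illustrative function: shallow dip near x = 1.1, deeper dip near x = -1.3.
    return x ** 4 - 3 * x ** 2 + x


def greedy_descent(x: float, step: float = 0.1) -> float:
    """Move to the better neighbour until neither neighbour improves."""
    while True:
        candidate = min((x - step, x + step), key=f)
        if f(candidate) >= f(x):
            return x
        x = candidate


def anneal(x: float, seed: int = 0, steps: int = 5000, t0: float = 2.0,
           cooling: float = 0.999, lo: float = -2.5, hi: float = 2.5) -> float:
    """Simulated annealing; returns the best point seen during the walk."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    best = x
    temp = t0
    for _ in range(steps):
        candidate = min(hi, max(lo, x + rng.uniform(-0.5, 0.5)))
        delta = f(candidate) - f(x)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x = candidate
            if f(x) < f(best):
                best = x
        temp *= cooling
    return best
```

Starting both from x = 2.0, the greedy descent stops in the shallow dip near 1.1, whereas the annealing run is free to wander across the hump and record the deeper dip if it gets there.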

· Watch out for rounding errors - http://en.wikipedia.org/wiki/Round-off_error - many iterations running in parallel can quickly amplify them & blow up your algo! Compare floats using an epsilon, and avoid unnecessary truncations and approximations
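
A quick Python sketch of the rounding problem and two common mitigations: `math.fsum` for summation that tracks the lost low-order bits, and `math.isclose` for epsilon comparisons. The helper names and the tolerance value are just illustrative choices.

```python
import math


def naive_sum(values):
    """Plain left-to-right accumulation; rounding error builds up each step."""
    total = 0.0
    for v in values:
        total += v
    return total


def nearly_equal(a: float, b: float, eps: float = 1e-9) -> bool:
    """Epsilon comparison instead of testing floats with ==."""
    return math.isclose(a, b, rel_tol=eps, abs_tol=eps)


values = [0.1] * 10
print(naive_sum(values) == 1.0)              # exact comparison fails
print(math.fsum(values) == 1.0)              # compensated summation recovers 1.0
print(nearly_equal(naive_sum(values), 1.0))  # epsilon comparison succeeds
```

When partial results from many tasks are reduced into one value, summing them with `math.fsum` (or an equivalent compensated sum) keeps the error from growing with the number of terms.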

Happy Coding !

Posted on Wednesday, October 10, 2012 2:34 PM | Parallelism
