ATTENDING

  • Jerry Sheehan, Josh Turner, Pol Llovet, Aurelien Mazurie, Jonathan Hilmer, Thomas Heetderks

ABSENT

MINUTES: HPCAG Meeting #7 – (meeting presentation)

  1. Welcome - Jerry
  2. Information - Pol & Aurelien
    • Hyalite Usage - Aurelien
      • Active Users
      • User Trends
        • Trends over 2016: stable growth
      • Jobs Run
      • Jobs Trends
        • Trends over 2016: mostly consistent use
    • Expansion Update - Pol
      • Installation completed Jan. 25
      • Added:
        • 1 Extra-large node (40 cores, 1.5 TB RAM)
        • 4 regular nodes (20 cores each, 256 GB RAM)
        • upgraded RAM for 12 regular nodes (from 64 to 256 GB)
      • The Extra-large node can take a GPU – but currently there is no real interest
      • All tested and running
  3. RAM Policy & Queue Changes - Pol
    • RAM is now tracked by SLURM
      • Default: 2GB RAM per CPU
      • Examples
        1. 16 CPU job, by default will be killed if it uses more than 32GB RAM
        2. 32 CPU job, by default will use all the RAM of a standard node (64GB)
        3. 1 CPU job, 64GB RAM specified, will fully allocate a standard node (previously, this would not fully allocate a node)
    • High RAM nodes: 16 existing nodes now with 256GB RAM
      • Members of the default and priority queues
      • Just specify more than 64GB RAM in your SLURM command
      • Example: to schedule a High RAM node job: #SBATCH --mem=256000
    • Queue changes to accommodate 256GB RAM nodes and Xlarge node
      • Two new queues
        • Xlarge: single node, pre-emptive queue for GravityLab (Yunes)
        • unsafe: all nodes (includes Xlarge node), queue for everyone
          • limitations:
            • jobs can be pre-empted by other queues
            • pre-empted jobs get re-queued
      • Xlarge node not part of the default queue
  4. Hyalite Documentation - Pol
  5. Discussion - Pol/Jerry/Jonathan
    • Heterogeneity
      • Xlarge node moves us in a new direction
      • With our job scheduling and management, it will work out fine
    • CFAC funding for classroom nodes (Robert Szilagyi class)
      • allowing us to keep classroom use separate
      • discussion: classroom use should be separate from regular Hyalite use
    • MSU Cores: meetings & VPR
      • discussion: RCi and Hyalite should be a core
    • CyberCANOE Display Wall
    • USGS & ALCES FLIGHT
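
NOTE ON ITEM 3 (RAM POLICY): The arithmetic behind the RAM policy examples can be sketched as below. This is only an illustration, not the scheduler's actual logic. The function and constant names are hypothetical. The figures (2GB per CPU, 64GB on a standard node) come from the minutes, and the GB-to-MB conversion assumes 1000 MB per GB to match the 256000 value quoted above.

```python
# Illustrative sketch of the RAM policy arithmetic; names are hypothetical.
# Values taken from the minutes: 2GB default per CPU, 64GB on a standard node.
DEFAULT_GB_PER_CPU = 2
STANDARD_NODE_RAM_GB = 64


def job_ram_gb(cpus, requested_gb=None):
    """RAM limit (GB) for a job: the explicit request if given, else the per-CPU default."""
    if requested_gb is not None:
        return requested_gb
    return cpus * DEFAULT_GB_PER_CPU


def fills_standard_node(cpus, requested_gb=None):
    """True if the job's RAM limit takes up all the RAM of a standard node."""
    return job_ram_gb(cpus, requested_gb) >= STANDARD_NODE_RAM_GB


def needs_high_ram_node(cpus, requested_gb=None):
    """True if the job asks for more RAM than a standard node has."""
    return job_ram_gb(cpus, requested_gb) > STANDARD_NODE_RAM_GB


def mem_directive(gb):
    """Batch-script line for a per-node memory request; --mem is given in MB here.

    Assumes 1000 MB per GB, matching the 256000 figure in the minutes.
    """
    return f"#SBATCH --mem={gb * 1000}"


if __name__ == "__main__":
    print(job_ram_gb(16))                 # example 1: 16 CPUs at the default
    print(fills_standard_node(32))        # example 2: 32 CPUs at the default
    print(fills_standard_node(1, 64))     # example 3: 1 CPU, 64GB requested
    print(mem_directive(256))             # High RAM node request
```

With these assumptions, a request above 64GB is what marks a job for a High RAM node, which matches the guidance to specify more than 64GB in the SLURM command.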

ACTIONS

  • We will send out to the group: information on ALCES FLIGHT & USGS

FUTURE AGENDA

  • Hyalite RAM Policy
  • Hyalite Storage Policy
  • Hyalite Expansion Update
  • CyberCANOE (display wall) installation