ATTENDING

  • Jerry Sheehan, Pol Llovet, Aurelien Mazurie, Thomas Heetderks

ABSENT

MINUTES: HPCAG Meeting #4 – (meeting presentation)

  1. Welcome - Jerry
  2. Hyalite XDMoD Usage Summary - Pol
    • XDMoD stats: March through May 23
      • Active Users: 35
      • Total Jobs: 85,409
      • Total CPU Hours: 1,687,511
      • Average Job: 19.86 hours
      • Average Wait Time: 20 hours
    • Very Deep Queue in April
      • >9K jobs deep
      • Most from a single researcher
      • Showed "fair-share" algorithm issues
      • These issues are being addressed
  3. Hyalite Student Usage - Pol
    • Student jobs 57% of cluster hours
    • Total CPU Hours: 7,943,855.90 (since cluster installation)
      • Student CPU Hours: 4,542,543.47 (57.18%)
      • Graduate CPU Hours: 4,286,823.10 (53.96%)
      • Undergrad CPU Hours: 255,720.37 (3.22%)
    • discussion: grad student work is (traditionally) research, NOT "learning"
    • Jerry: to facilitate "learning", coursework might be better served with a wrapper such as JupyterHub
  4. Researcher Introduction: Joe Atwood, Agriculture Economics - Jerry
    • Onboarded on Hyalite; set up R environment and RStudio modules
    • The weather data contains monthly total precipitation and monthly average daily temperature for the years 1900‐2009
    • Work is pleasantly parallel on millions of linear calculations
  5. Hyalite Expansion Status - Pol
    • 16 New Nodes to be installed soon
      • delayed, but should ship late this week
      • should be installed with NO outage
    • QUOTE on High CPU, RAM, I/O Node
      • Quad E5-4648v3 CPU (96 HT cores)
      • 512GB RAM (1TB max)
      • 1TB Local SSD
      • $15,000 - $18,000
      • ~4x power, ~3.5x cost of existing node
  6. Research Data Census 2016 - Jerry
    • RDC2: Demographics
      • Most respondents: Faculty
      • -- followed by Staff
    • RDC2: Types of Data
      • Majority Text & Spreadsheets
      • There is an increase in Timeseries Data over last year
    • RDC2: Research Data Size
      • Majority: 10-100 GB
      • There's an overall increase in data sizes over last year
    • RDC2: Where do you store your data?
      • Most data on Office or Lab computer
      • -- followed by External HD
    • RDC2: Long term storage need?
      • Over half answered 1 TB
      • Second most popular: 10 TB
    • RDC2: Long term storage access methods?
      • Mapped drive on Desktop most popular (but not by huge margin)
      • Overall, responses very mixed
    • RDC2: Business Model
      • Majority prefer some other arrangement to Fee Based
      • Most want a standard allocation with special arrangements for additional storage
  7. Research Focus Groups Summary - Jerry
    • Two focus groups were held in May (mostly faculty)
    • Concern expressed about "finding" institutional data assets
    • Consistent with RDC2: long-term archive should be centrally supported
    • How data is uploaded is of little concern
    • Next Steps
      • Synthesis of RDC and Data Focus Groups results
      • RFP Team to be named in June to develop requirements
      • July target date for RFP publication
  8. Data Science Competition - Aurelien
    • Open to students from all disciplines
    • Students are invited to solve a data-rich problem through visualization and/or number crunching
    • Currently working with the Office of Sustainability
      • Competition around optimum deployment of new and existing bike racks across campus, taking into account pedestrian traffic, bike traffic, sidewalks and building locations, building occupancy
    • ETA: Fall 2016
    • discussion: you should talk to Suzy Taylor (Extended University) about STEM Women
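
Note on item 3 (Hyalite Student Usage): the percentage shares can be re-derived from the raw CPU-hour totals reported there. The following is a minimal sketch for anyone who wants to check the arithmetic. The variable and function names are illustrative assumptions, not part of any XDMoD export.

```python
# CPU-hour totals as reported under item 3 (since cluster installation).
total_hours = 7_943_855.90
student_hours = 4_542_543.47
graduate_hours = 4_286_823.10
undergrad_hours = 255_720.37


def share(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(part / whole * 100, 2)


# Graduate and undergraduate hours should add up to the student total.
subtotal_gap = abs((graduate_hours + undergrad_hours) - student_hours)

print("Student share:  ", share(student_hours, total_hours))
print("Graduate share: ", share(graduate_hours, total_hours))
print("Undergrad share:", share(undergrad_hours, total_hours))
print("Grad + undergrad matches student total:", subtotal_gap < 0.01)
```

The same helper could be reused for future reporting periods by swapping in updated totals.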

FUTURE AGENDA

  • Hyalite Communication & Publicity