
Posts by Benedikt Riedel

Not agreeing with the actions. The dataset itself is skewed. The bulk of NSF awards are 3 years long (NSF prides itself on this). Most Trump 1.0 grants have completed or are in NCE at this point.

11 months ago 0 0 0 0

Isn’t FY25 kind of a done deal with the CR that passed in March?

1 year ago 0 0 1 0

Is it run by grown-ups if it ends the same way as WeWork?

1 year ago 0 0 0 0
Scaling distributed training with AWS Trainium and Amazon EKS | Amazon Web Services Recent developments in deep learning have led to increasingly large models such as GPT-3, BLOOM, and OPT, some of which are already in excess of 100 billion parameters. Although larger models tend to ...

For bulk data storage, object stores are great. In my experience, taking the POSIX layer away from users is difficult without significant investment into the metadata service (as you pointed out in the blog).
This AWS blog post wants you to use Lustre for ML training:
aws.amazon.com/blogs/machin...

1 year ago 0 0 1 0

Really good read! Is this specific to LLMs? There has been a lot of investment into getting parallel file systems into the cloud, so what are they good for? "Classic" HPC workloads?

1 year ago 0 0 1 0

I gotta ask. Have the investment decisions been affected by things like DeepSeek, the lowering cost of inference, or the next step in AI needing more dev time than compute time?

1 year ago 0 0 0 0

Ground News might be a good aggregator

1 year ago 1 0 0 0

More on the point… GEANT exists, ESNet exists. The issue is mostly security and setup on the DOE side. Europe has the same security issues as DOE. If they can’t trust each other over a network that they own and operate, I don’t see this going far beyond setting up Globus endpoints or the equivalent.

1 year ago 1 0 0 0

The biggest hurdle with IRI will be the security aspects and technical choices. I don’t see Aurora or Frontier turning into a Perlmutter-like system with IRI, but they should. NAIRR is separate from ACCESS really. NAIRR and ACCESS use the same resources, but not much else.

1 year ago 0 0 1 0
Home - Access Following its highly successful Extreme Science and Engineering Discovery Environment (XSEDE) project, the National Science Foundation (NSF) is excited to introduce new advances in innovative cyberinf...

TeraGrid and XSEDE fell to the NSF “rule” that after 10 years you need to be a new project. ACCESS is the current iteration and it is being used. ACCESS has been stripped down significantly from TeraGrid and XSEDE, though. It mostly handles allocations, user portal, and metrics now. access-ci.org

1 year ago 2 0 0 0

I can’t just mount the necessary shared objects from the system; I have to install them myself in the container. Again, I haven’t spent more than a couple of days on this, so there may be a solution to all of this that just isn’t documented (yet).

1 year ago 0 0 1 0

The container support is also not great. For example, there appears to be a difference between the packages on RHEL and Ubuntu. I can get Ubuntu containers to work but not RHEL. Great if you do ML, not great if you don’t.

1 year ago 0 0 1 0

There appear to be multiple copies of shared objects whose distinction isn’t clear. For example, there is a libOpenCL.so in multiple places, but the OpenCL shared object for GPUs has a different name (libigdrcl.so).

1 year ago 0 0 1 0

This is a work in progress and we use OpenCL for the accelerated code, so I understand that the support isn’t a first-class citizen. There are so many packages for OneAPI: 100+. OneAPI packages create a lot of directories (a good fraction of which are empty) as well.

1 year ago 0 0 1 0

Underneath it is, but Intel has extended it and filed off the edges as far as I am aware.

1 year ago 1 0 0 0

1+ million dying and millions more disabled from an airborne disease didn’t change people’s minds. Do you really think a loved one getting sick will?

1 year ago 1 0 3 0

Maybe OneAPI will follow DAOS into its own foundation. I mean, from what I heard at SC, Intel GPU deployments are underutilized. And from playing with OneAPI, it gets messy quickly.

1 year ago 0 0 2 0

The package “todonotes” might also be handy

1 year ago 0 0 1 0

It really is an 18-month cycle with the House/Senate election cycle.

1 year ago 1 0 0 0

Academia tends to hold onto ideas that might not be readily commercialized in the beginning, see mRNA vaccines.

1 year ago 0 0 1 0

“Because leadership is interested in science” is one line I heard today. Industry also tends to glom on to a certain tech (see the LLM craze) and seemingly forgets all the other use cases, except maybe in their respective niche, that could be commercialized.

1 year ago 0 0 1 0

We were in the B302 corner

1 year ago 1 0 0 0

Over on the workshop side we only had unsweet tea

1 year ago 1 0 1 0

It sounded like Arc is having issues. At least their Win version seems to be circling the drain. I have been using sigma.os as a replacement.

1 year ago 0 0 1 0

An iPhone does a pretty good job as a pedometer. For all the other metrics you would need a ring.

1 year ago 0 0 0 0

Have you tried different papers? I love the Diamine Apple Green and haven’t had an issue with it.

1 year ago 0 0 0 0