Supercomputing memory management tool makes data storage more efficient


Researchers from the Department of Energy’s Oak Ridge National Laboratory have developed a new application to increase efficiency in memory systems for high-performance computing.

Rather than allowing data to bog down traditional memory systems in supercomputers and degrade performance, the team from ORNL, along with researchers from the University of Tennessee, Knoxville, created a framework to manage data more efficiently in memory systems with more complex, tiered structures. Research papers detailing their work were recently accepted in ACM Transactions on Architecture and Code Optimization and The International Journal of High Performance Computing Applications.

Working under the Exascale Computing Project, or ECP, a multi-year software research, development and deployment project managed by DOE, ORNL senior computer science researcher Terry Jones and his team titled their work the “ECP Simplified Interface to Complex Memories,” or SICM, Project.

“Our work is to automatically put the frequently used objects into the right location in the faster tier of memory and put the less-used objects, the things that aren’t accessed as often, into the slower memory,” said Jones. “Our work shows it performs better than previous strategies.”

To optimize the vast amounts of data stored on high-performance computers like ORNL’s Frontier, the world’s first exascale supercomputer, scientists need ways to structure memory systems around how the stored information is actually used. Memory tiers that retrieve information faster are more expensive and limited in capacity, while tiers that hold more data operate at slower speeds.

Jones also said that, conventionally, memory systems have operated on a “first touch” principle, where data is placed in the fastest memory tier in the order it is allocated, until that tier reaches capacity. However, many programs allocate data during initialization that is used only at startup, so the fastest memory fills with objects that will never be needed again.

“First touch is a less ideal approach for those kinds of applications,” Jones said. “Our approach uses more sophisticated techniques to determine if some data needs faster memory or not and can give you much better performance than first touch.”
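The trade-off Jones describes can be made concrete with a toy model. The sketch below simulates a two-tier memory and compares first-touch placement (objects go to the fast tier in allocation order) with frequency-aware placement (the most-accessed objects get the fast tier). All names, latencies, and access counts are illustrative assumptions, not values or APIs from the SICM project itself.

```python
# Toy comparison of "first touch" vs. frequency-aware placement
# across a two-tier memory. Latencies, capacities, and object names
# are hypothetical; this is not the SICM implementation.

FAST_LATENCY = 1    # relative cost per access to the fast tier
SLOW_LATENCY = 5    # relative cost per access to the slow tier
FAST_CAPACITY = 2   # how many objects fit in the fast tier

# (object name, access count): the first two objects model
# initialization data that is touched once and never again.
objects = [("init_a", 1), ("init_b", 1),
           ("hot_array", 1000), ("warm_array", 200)]

def first_touch_cost(objs):
    """Place objects in allocation order: fast tier until it is full."""
    cost = 0
    for i, (_, accesses) in enumerate(objs):
        latency = FAST_LATENCY if i < FAST_CAPACITY else SLOW_LATENCY
        cost += accesses * latency
    return cost

def frequency_aware_cost(objs):
    """Place the most frequently accessed objects in the fast tier."""
    ranked = sorted(objs, key=lambda o: o[1], reverse=True)
    cost = 0
    for i, (_, accesses) in enumerate(ranked):
        latency = FAST_LATENCY if i < FAST_CAPACITY else SLOW_LATENCY
        cost += accesses * latency
    return cost

print(first_touch_cost(objects))      # init data occupies the fast tier
print(frequency_aware_cost(objects))  # hot objects get the fast tier
```

In this model the initialization objects claim both fast-tier slots under first touch, forcing every access to the heavily used arrays through slow memory; the frequency-aware policy swaps them, which is the kind of gain Jones attributes to SICM's approach.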

Using the SICM system, information is automatically sorted and stored based on need, making retrieval significantly more efficient and allowing developers to write programs that better use the full capacity of supercomputing systems. Moving forward, this tool will enable multiple programs with different storage needs to function within a single supercomputing rack through a new technology called CXL.

“Imagine that within a rack of a supercomputer there’s a lot of memory, and all the nodes inside that same rack could get whatever they need from that memory,” Jones said. “So, if inside a rack there are multiple programs, such as an AI application and a complex calculation on a small dataset, the AI program will need lots of memory, but the complex calculation will not need as much memory. Dynamically inside that rack, we could have this memory move around while those two codes are running.”

Source: https://www.ornl.gov/news/supercomputing-memory-management-tool-makes-data-storage-more-efficient
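The scenario Jones sketches, where capacity shifts between jobs sharing a rack, can be modeled as a shared pool that nodes borrow from and return to. The class, method names, and gigabyte figures below are hypothetical illustrations of the idea, not a CXL or SICM API.

```python
# Minimal sketch of rack-level memory pooling in the spirit of CXL:
# jobs borrow capacity from a shared pool and return it as their
# needs change. All names and sizes are illustrative assumptions.

class RackMemoryPool:
    def __init__(self, total_gb):
        self.free_gb = total_gb
        self.allocations = {}  # job name -> GB currently borrowed

    def request(self, job, gb):
        """Grant up to `gb` from the shared pool; return the amount granted."""
        granted = min(gb, self.free_gb)
        self.free_gb -= granted
        self.allocations[job] = self.allocations.get(job, 0) + granted
        return granted

    def release(self, job, gb):
        """Return capacity to the pool for other jobs in the rack."""
        returned = min(gb, self.allocations.get(job, 0))
        self.allocations[job] -= returned
        self.free_gb += returned
        return returned

pool = RackMemoryPool(total_gb=1024)
pool.request("ai_training", 900)   # memory-hungry AI application
pool.request("small_calc", 64)     # complex calculation, small dataset
pool.release("ai_training", 400)   # the AI job shrinks mid-run...
pool.request("small_calc", 128)    # ...so the other job can grow
```

The point of the model is the last two lines: memory freed by one running program becomes immediately available to another in the same rack, rather than sitting stranded on a single node.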