The National Institute for Computational Sciences

HPC Glossary

This is a list of basic terms that might be used in HPC. For more information about HPC and what it does, see What Is HPC?

BlueGene
IBM's current line of supercomputers is known as BlueGene. Rather than standard desktop or server CPUs, BlueGene systems are built from very large numbers of low-power, embedded PowerPC processors.
Cabinet
The nodes of a supercomputer are physically mounted in racks inside cabinets, which also house the networking and cooling systems, much as traditional servers are racked in a machine room.
CLI
A Command Line Interface is a user interface in which the user types commands, rather than clicking buttons or other graphical elements as in a Graphical User Interface. For example, Microsoft's DOS prompt and the GNU shell (e.g., bash) are CLIs. Command line interfaces can also be found in third-party programs such as Matlab.
CPU
CPU stands for Central Processing Unit: the part of a computer which executes software programs. The term is not specific to a particular technology; units built from transistors, relays, or vacuum tubes could all be considered CPUs. For clarity, however, we use the term to refer to an individual silicon chip, such as Intel's Pentium or AMD's Athlon. A CPU therefore contains one or more cores, and an HPC system may contain many CPUs. Kraken, for example, contains several thousand AMD Opteron CPUs.
Core
A core is an individual processor: the part of a computer which actually executes programs. CPUs once had a single core, so the two terms were interchangeable. In recent years, several cores have been manufactured on a single CPU chip, which is then referred to as a multi-core processor. The relationship between the cores can vary considerably, however: AMD's Opteron, Intel's Itanium, and IBM's Cell take very different approaches.
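As an illustration (a minimal C sketch, not part of the original glossary), a program can ask the operating system how many cores it sees. This works on Linux and most UNIX-like systems; the count reported is whatever the node's operating system exposes, which may differ from the number of physical CPU chips.

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Ask the operating system how many cores are online on this node. */
        long cores = sysconf(_SC_NPROCESSORS_ONLN);
        printf("This node reports %ld cores online\n", cores);
        return 0;
    }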
Cyberinfrastructure
Cyberinfrastructure consists of computing systems, data storage systems, data repositories and advanced instruments, visualization environments, and people, all linked together by software and advanced networks to improve scholarly productivity and enable breakthroughs not otherwise possible.
DoE
The Department of Energy runs the US National Laboratories, such as Oak Ridge National Laboratory.
ftp
FTP (File Transfer Protocol) is a protocol, and also a utility, used to transfer files over a network connection. Because FTP sends data unencrypted, use the related sftp for secure transfers.
GPU
Graphics cards contain GPUs, or Graphics Processing Units, which process visual data. These processors have a different architecture from standard CPUs and are considered a promising technology for small-scale parallel processing; some workloads that once required supercomputers may eventually run on a single GPU card.
GUI
A Graphical User Interface is a visual medium for users to input commands, usually using the mouse and keyboard, as opposed to a Command Line Interface, which uses typed commands. You are probably using a GUI right now to read this page, for example.
HPC
High Performance Computing is the term often used for large-scale computers and the simulations and models which run on them.
HPSS
A single research group may create many terabytes of data, so it is important to have somewhere to store it. HPSS, the High Performance Storage System, is shared between NICS and NCCS and consists of several petabytes of disk and tape storage.
Instruction-level Parallelism
Within an individual core, separate sections of the processor perform different tasks. You might think of it like an assembly line: with no instruction pipelining, the first person in the line would sit idle until the last person finished, so each person would spend most of their time waiting for a turn to work. With instruction pipelining, in the ideal case, each person starts on the next part as soon as they finish the previous one, so there is no waiting. A more in-depth description can be found at Wikipedia.org. Since this type of parallelism is implemented within the CPU, a programmer generally does not have to worry about it, in contrast to thread parallelism.
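As a rough illustration (a sketch added here, not from the original text), consider the C fragment below: the first two multiplications are independent of each other, so a pipelined or superscalar core may overlap them, while the final addition must wait for both results. The programmer writes ordinary sequential code; the compiler and hardware find this parallelism automatically.

    #include <stdio.h>

    int main(void)
    {
        double a = 1.5, b = 2.5, c = 3.5, d = 4.5;

        /* These two multiplies do not depend on each other, so the
           core's pipeline can work on both at the same time. */
        double x = a * b;
        double y = c * d;

        /* This addition depends on both results, so it must wait. */
        double z = x + y;

        printf("z = %f\n", z);
        return 0;
    }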
Linux
Linux is an operating system, similar to UNIX, which is becoming quite popular for supercomputers due to abundant support, user familiarity, and comparable performance with optimized UNIX systems. Kraken, for example, runs on a modified version of Linux.
NCCS
The National Center for Computational Sciences is the Department of Energy's supercomputing center at Oak Ridge National Laboratory.
NICS
The National Institute for Computational Sciences is the University of Tennessee's supercomputing center. It is located at Oak Ridge National Laboratory, which allows it to share many resources with the Department of Energy's supercomputing center, NCCS. Simulations from researchers all over the country run on the NICS computer, Kraken.
Node
In traditional computing, a node is an object on a network. For example, on a home network, your computer, router, and printer might all be nodes. Supercomputers like Kraken are essentially networks, with nodes that communicate with each other to solve a larger problem than any single computer could in a reasonable amount of time. Kraken contains several types of nodes: compute nodes are the work-horses of the system and are much like stripped-down computers, while an I/O node is the interface between the compute nodes and other computers; that is, it handles input and output for the system.
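The entry above does not name a specific programming interface, but on systems like Kraken the compute nodes typically cooperate through MPI (the Message Passing Interface). The minimal C sketch below assumes an MPI installation: each process (usually one or more per node) learns its own rank and the total number of processes, which is the starting point for dividing up a larger problem.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which process am I?       */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many processes total? */

        printf("Hello from process %d of %d\n", rank, size);

        MPI_Finalize();
        return 0;
    }

Such a program is typically compiled with an MPI wrapper (for example, mpicc) and started by a job launcher such as mpirun or, on Cray XT systems, aprun, which runs copies of it across the allocated nodes.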
Open Research
Open research is research that is not protected by proprietary claims or classified by the government.
ORNL
Oak Ridge National Laboratory: the Department of Energy's site in east Tennessee.
Scratch Space
Supercomputers generally provide what is called scratch space: disk space available for temporary use while a job runs. It is analogous to a desk covered in scratch paper: it is where data sits while it is being worked on, before it is filed away elsewhere or discarded.
ssh
SSH is a protocol for securely connecting to a remote computer, and also a program which uses this protocol. The connection generally provides a command line interface, but it is possible to run GUI programs over SSH. For more information about how to use SSH, see Access.
Thread Parallelism
Thread parallelism refers to splitting a program into semi-independent threads. For example, if I needed to clean the kitchen and the bathrooms, I could do the kitchen, then the bathrooms—this might be considered thread-serial. If I had a helper, however, they could clean the kitchen while I cleaned the bathrooms. Two related tasks are being carried out independently (and simultaneously), so it is an analogy for thread-parallel execution. This contrasts with other forms of parallelism, such as instruction-level parallelism.
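As a hedged illustration (OpenMP is not mentioned in the glossary itself, but is a common way to express thread parallelism in C), the sketch below splits the iterations of a loop among several threads, much like splitting chores among helpers:

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        double sum = 0.0;

        /* The iterations are divided among the available threads; each
           thread accumulates a partial sum, and the partial sums are
           combined at the end (the "reduction"). */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 1; i <= 1000000; i++) {
            sum += 1.0 / i;
        }

        printf("sum = %f using up to %d threads\n", sum, omp_get_max_threads());
        return 0;
    }

Compiled with OpenMP support (for example, gcc -fopenmp), the same source runs serially on one thread or in parallel on many, depending on how many threads are available.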
top500.org
Top500.org is a list of the fastest computers in the world, compiled twice a year.
UNIX
UNIX is an operating system first developed in the 1970s. It has gone through a number of incarnations and still has many popular versions. UNIX dominated supercomputing for many years; however, the high performance computing community has increasingly turned to Linux as an operating system.
XSEDE
The National Science Foundation funds a group of supercomputing sites devoted to open research. Researchers may apply through XSEDE for allocations to run their programs at one or more of the centers. The centers also work together to share resources and expertise and to further scientific understanding.
XT
Cray's current line of supercomputers. Kraken is an XT5 system.