I've been carrying an evolving Glossary of Terms through the past many years of this career, and today brings an accounting of that sharable reference. Perhaps the terms will clarify some aspects of my writing, at least in my professional life involving HPC and AI/ML storage and systems architecture. Neurology is another story, so perhaps a future post will bring a Neuro-Glossary soon.
Glossary of Terms: Systems, Storage, Network
The following list of terms is never complete; new concepts are added and adapted over time.
ACL (Access Control List)
A list of permissions attached to an object (file/directory) specifying which users, groups, or system processes can access it and what operations (read, write, execute) are allowed. ACLs extend the basic owner/group/mode bits of a UNIX filesystem; NFSv4 ACLs and Windows ACLs are two common types (the former more POSIX-oriented, the latter more detailed).
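As a quick illustration of the basic owner/group/mode bits that ACLs extend, Python's standard library can decode a raw POSIX mode into the familiar `rwx` string:

```python
import stat

# Decode a POSIX mode integer into the familiar rwx string.
# ACLs add per-user and per-group entries beyond these three
# coarse owner/group/other classes.
mode = 0o100644  # regular file, rw-r--r--
print(stat.filemode(mode))  # -rw-r--r--
```

NFSv4 and Windows ACLs carry far richer entries (allow/deny, inheritance flags) than these nine bits can express.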
AFM (Active File Management)
A feature of IBM Spectrum Scale that allows caching and sync of data between clusters (for multi-site or hierarchical storage).
All-Flash
Refers to storage systems that use all flash memory (SSD/NVMe) as opposed to spinning hard disks. Provides much higher IOPS and throughput, beneficial for HPC and low-latency needs.
CSI (Container Storage Interface)
CSI drivers allow storage systems to integrate with container orchestrators (e.g., Kubernetes) for dynamic provisioning of volumes.
Common Criteria (NIAP)
A framework (with Evaluation Assurance Levels, EAL) for certifying the security of IT products. NIAP is the U.S. National Information Assurance Partnership. For example, an EAL2 certification might be required for government use of a storage device.
Deduplication & Compression
Data reduction techniques. Dedup finds identical blocks and stores only one copy, saving space. Compression algorithmically reduces data size. Both are effective in reducing cost per logical stored TB, especially on systems like VAST and Pure Storage.
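A toy sketch of how block-level dedup and compression combine (real systems do this inline with content-defined chunking and hardware-assisted hashing, but the principle is the same):

```python
import hashlib
import zlib

def dedup_and_compress(data: bytes, block_size: int = 4096):
    """Toy block-level dedup: store each unique block once (keyed by
    its SHA-256 digest), then compress the unique blocks."""
    store = {}    # digest -> compressed block
    recipe = []   # ordered digests needed to reconstruct the data
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return store, recipe

data = b"A" * 4096 * 8 + b"B" * 4096 * 8   # 16 logical blocks, only 2 unique
store, recipe = dedup_and_compress(data)
print(len(recipe), "logical blocks,", len(store), "unique stored")

# Reconstruct from the recipe and verify nothing was lost.
restored = b"".join(zlib.decompress(store[d]) for d in recipe)
assert restored == data
```

Here 16 logical blocks reduce to 2 stored blocks before compression even runs, which is why dedup ratios are reported separately from compression ratios.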
Disaggregated Architecture
A design where compute (protocol processing) is separated from storage media enclosures. For example, VAST uses disaggregated compute nodes (CNodes) and storage boxes (DBOXes) connected via NVMeoF.
Erasure Coding
A data protection method that breaks data into parts, encodes them with redundancy (parity), and spreads them across multiple disks/nodes such that the data survives some number of failures and can be rebuilt if parts are lost. More space-efficient than mirroring or full replicas; used by Scality, Ceph, etc.
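The simplest erasure code is single XOR parity (RAID-5-style); production systems use Reed-Solomon codes to tolerate multiple failures, but XOR shows the rebuild idea in a few lines:

```python
from functools import reduce

def xor_parity(chunks):
    """XOR each byte column across the chunks to produce one parity
    chunk. Any single lost chunk can be rebuilt by XOR-ing the
    survivors with the parity."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

data = [b"AAAA", b"BBBB", b"CCCC"]   # 3 data chunks
parity = xor_parity(data)            # 1 parity chunk -> 33% overhead

# Simulate losing chunk 1 and rebuilding it from survivors + parity.
rebuilt = xor_parity([data[0], data[2], parity])
assert rebuilt == data[1]
```

Compare the 33% overhead here (3 data + 1 parity) with mirroring's 100%: that space efficiency is the whole point.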
FIPS 140-2 / 140-3
U.S. government standards (older 140-2 and newer 140-3) for validating cryptographic modules. If a product is FIPS 140-3 validated, its encryption has been tested and approved for use in government systems (high assurance).
GPFS (General Parallel File System)
The old name for IBM Spectrum Scale, a distributed filesystem known for high performance in HPC.
GPUDirect Storage
A technology by NVIDIA that allows GPUs to directly perform IO to storage (bypassing CPU) if the storage and network support it. Reduces latency for GPU training data ingestion.
IOPS (Input/Output Operations Per Second)
A metric of how many read/write operations can be done in a second. Often used for measuring performance with small blocks (e.g., 4K). High IOPS with low latency indicates a system good for random small file workloads.
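IOPS, latency, and concurrency are tied together by Little's law (concurrency = IOPS × latency), which is why deep queues are needed to saturate fast devices. A back-of-envelope calculation, with latency as an assumed figure:

```python
# Little's law for storage queues: concurrency = IOPS * latency,
# so achievable IOPS ~= queue_depth / average latency.
queue_depth = 32
latency_s = 0.0002            # assume 200 microseconds per 4K I/O
iops = queue_depth / latency_s
print(f"{iops:,.0f} IOPS")    # 160,000 IOPS
```

The same device at queue depth 1 would deliver only 5,000 IOPS at that latency, which is why single-threaded benchmarks understate flash performance.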
ISO 27001
An international standard for information security management. A product isn’t certified per se, but companies can certify their processes. Vendors adhering to 27001 ensure their product development and support follow security best practices.
Immutable Snapshot
A point-in-time copy of data that cannot be altered or deleted until certain conditions are met (used for WORM compliance, ransomware protection).
IB
Short for InfiniBand.
InfiniBand
A high-speed networking technology with RDMA, often used in HPC clusters for low latency and high bandwidth (e.g., 100 Gbps EDR, 200 Gbps HDR).
KMS (Key Management System)
External system to manage encryption keys (e.g., HashiCorp Vault, Thales). Many storage products integrate with KMS via KMIP (Key Management Interoperability Protocol).
Latency (storage context)
The time it takes to complete an I/O operation (e.g., read or write). Low latency is critical for small random IOPS-heavy workloads (like database or AI metadata). Usually measured in milliseconds or microseconds.
NFS (Network File System)
A client-server distributed file system protocol allowing remote files to be accessed over a network as if they were local. Common in UNIX/Linux environments for shared storage. v3 is stateless; v4 adds state, ACLs, and optional pNFS for parallel access.
NIST SP 800-53
A catalog of security and privacy controls for federal information systems. Being “aligned to NIST 800-53” means the system supports many of the controls (like AC- access control, IA- identification & authentication, etc.).
NVMe (Non-Volatile Memory Express)
A high-performance interface for SSDs (especially PCIe SSDs). NVMe drives have low latency and high throughput, extensively used in modern flash arrays.
NVMe-oF (NVMe over Fabrics)
A protocol that extends NVMe over a network fabric such as Ethernet or InfiniBand, allowing remote NVMe drives to be accessed with near-local efficiency and performance.
Object Storage
Storage that manages data as objects (with a unique key, metadata, and data), accessible via APIs like S3 or Swift rather than as a file hierarchy. Scalable to very large capacities and often easier to distribute than filesystems.
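The data model is just a flat key-value namespace with per-object metadata; a minimal sketch (real systems expose this over S3/Swift HTTP APIs and distribute the keyspace across nodes):

```python
# Toy object store: flat namespace of key -> (metadata, bytes).
# No directory hierarchy; "folders" are only a naming convention
# inside the key string.
store = {}

def put(key: str, data: bytes, **metadata):
    store[key] = {"meta": metadata, "data": data}

def get(key: str) -> bytes:
    return store[key]["data"]

put("datasets/train/shard-0001", b"example payload",
    content_type="application/octet-stream")
assert get("datasets/train/shard-0001") == b"example payload"
```

Because there is no hierarchy to keep consistent, object stores scale out more easily than POSIX filesystems, at the cost of whole-object (rather than byte-range in-place) update semantics.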
POSIX
A family of standards for maintaining compatibility among Unix-like operating systems. A "POSIX file system" implies traditional semantics: hierarchical directories, byte-addressable files, permissions, atomic operations, etc. HPC codes often assume POSIX compliance for file I/O.
QoS (Quality of Service)
The ability to manage and guarantee certain performance levels (throughput or IOPS) to certain workloads or clients. For storage, some systems allow QoS limits or reservations per user/share.
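A common mechanism behind per-client QoS limits is a token bucket: each admitted I/O consumes a token, and tokens refill at the configured rate. A toy sketch (the rate and burst figures are arbitrary assumptions):

```python
import time

class TokenBucket:
    """Toy IOPS limiter: admit an I/O only if a token is available;
    tokens refill continuously at the configured rate."""
    def __init__(self, rate_iops: float, burst: float):
        self.rate, self.capacity = rate_iops, burst
        self.tokens, self.last = burst, time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate_iops=100, burst=10)
admitted = sum(bucket.allow() for _ in range(1000))
print(admitted)  # roughly 10: the burst, plus any tokens refilled mid-loop
```

The `burst` parameter is what lets a QoS-limited client absorb short spikes without throttling steady-state throughput above the configured rate.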
RBAC (Role-Based Access Control)
Admin/users roles with certain permissions within a system’s management (e.g., admin, read-only admin, security officer roles). Many storage systems have RBAC in their management UI.
RDMA (Remote Direct Memory Access)
Technology to directly transfer data between computers’ memory over network, bypassing CPU to reduce latency (used in InfiniBand, RoCE networks).
RoCE (RDMA over Converged Ethernet)
A protocol to run RDMA over Ethernet networks by making them lossless (using priority flow control). Provides InfiniBand-like performance on Ethernet gear.
S3 (Simple Storage Service API)
An object storage protocol (originally from AWS S3) for storing/retrieving whole objects (files) via HTTP. Many systems implement S3 for scale-out storage.
SMB (Server Message Block)
A network file sharing protocol predominantly used by Windows for file and printer sharing on a LAN; known as CIFS in older versions. SMB3 adds encryption and signing for security.
SPC-1 / SPEC SFS
Industry benchmarks for storage. SPC-1 measures IOPS in a database-like workload. SPEC SFS measures NFS or SMB throughput and IOPS for different workloads (e.g., SPEC SFS 2014 SWBUILD for software builds, VDA for video streaming).
STIG (Security Technical Implementation Guide)
Guidelines used by U.S. DoD to secure systems (specific configurations to harden OS/applications).
Scale-Out vs Scale-Up
Scale-out means increasing capacity/performance by adding more nodes (horizontal scaling), whereas scale-up means using bigger hardware (vertical scaling). Scale-out is generally preferred for very large systems (e.g., Spectrum Scale, Scality, Qumulo are scale-out; Oracle ZFS is scale-up).
Snapshot
A read-only (sometimes read-write) copy of the filesystem state at a point in time. Useful for backups, quick recovery, or cloning datasets.
Throughput/Bandwidth
The rate of data transfer, typically measured in GB/s for these systems. Important for large file reads/writes (like streaming datasets for training or writing simulation output).
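Throughput and IOPS are two views of the same traffic, related by block size; the arithmetic explains why large sequential blocks dominate bandwidth benchmarks:

```python
# bandwidth = IOPS * block size (figures below are illustrative).
iops = 50_000
block_bytes = 1 << 20                  # 1 MiB sequential blocks
gib_per_s = iops * block_bytes / (1 << 30)
print(f"{gib_per_s:.1f} GiB/s")        # 48.8 GiB/s
```

The same 50,000 IOPS at a 4 KiB block size would move under 0.2 GiB/s, which is why a system can lead on one metric and trail on the other.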
Tiering
Storage resource management design involving algorithmic methods for moving data between different storage performance classes, with the goal of combining performance and capacity scaling into a model-able system with predictable KPIs. OpenZFS and its ARC (Adaptive Replacement Cache) is a good example of tiered data-structure design in OSS filesystems: ARC == DRAM, L2ARC == NVMe, SLOG (ZIL) and special vdevs == PMem/NVDIMM, with the equivalent of a final tier being large-format SAS3/SAS4 or SATA3 drives for the remaining pool.
Tiering-HSM
The "Hierarchical Storage Management" concept where data moves between tiers: hot data in DRAM (e.g., the ZFS ARC), cache-warm data on SSD/NVMe, cold/large-block data on dual-head SAS, and optionally archive data on tape or object storage. Implemented in Spectrum Scale (as ILM policies), Weka (tiering to object), etc.
WORM (Write Once Read Many)
A storage feature where data, once written, cannot be altered or deleted for a defined retention period. Used for compliance (financial records, medical data archiving). Implemented often via immutable snapshots or object lock.
Zero Trust Architecture
A security model where no implicit trust is given; even internal components must authenticate/authorize. In storage, features like requiring tokens for mount, or rootless admin, align with zero trust principles (assume breach and minimize attack surface).
mdtest/IOR/fio
Common benchmarking tools. mdtest generates many small file metadata ops. IOR (Interleaved Or Random) simulates parallel I/O (often used in HPC for throughput). fio is a flexible I/O tool that can do various patterns (used for random IOPS, etc.).
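For fio, workloads are usually described in a small job file. A minimal example (the `filename` path and sizing here are assumptions; adjust for the device under test):

```ini
; Minimal fio job: 4 KiB random reads, queue depth 32 via libaio,
; direct I/O to bypass the page cache, time-based 30 s run.
[randread-4k]
filename=/tmp/fio.testfile
size=1G
rw=randread
bs=4k
ioengine=libaio
iodepth=32
direct=1
runtime=30
time_based=1
```

Run it with `fio randread.fio`; the summary reports IOPS and latency percentiles, which is why fio is the usual tool for the small-random side of storage characterization, while IOR covers parallel throughput and mdtest covers metadata rates.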
pNFS (Parallel NFS)
An extension in NFSv4.1 that allows clients to access multiple storage server nodes directly in parallel, rather than funneling all data through a single NFS server, eliminating that single-server bottleneck and improving throughput. Conceptually powerful, but not widely implemented by all vendors.