[Paper] PruneX: A Hierarchical Communication-Efficient System for Distributed CNN Training with Structured Pruning
Inter-node communication bandwidth increasingly constrains distributed training at scale on multi-node GPU clusters. While compact models are the ultimate deplo...