This paper presents a novel approach to training large foundation models in decentralized, heterogeneous environments, where computational "tasklets" are allocated to devices connected by slow networks. The authors propose a scheduling algorithm and a formal cost model to optimize the allocation strategy, achieving substantial speedups over prior state-of-the-art systems. Extensive experiments show that the approach reduces training time by up to 4.8x compared to existing methods, while also providing efficient network compression. By leveraging decentralized and heterogeneous networks, this work aims to make large-scale foundation model training more accessible and cost-effective.
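
To make the idea of a cost-model-driven allocation concrete, here is a minimal, self-contained sketch (not the authors' actual formulation) of how one might score a placement of pipeline "tasklets" onto devices connected by links of very different bandwidths and then search for the cheapest placement. The device count, the `BANDWIDTH` table, `ACTIVATION_GB`, and the helper functions `pipeline_comm_cost` and `best_assignment` are all hypothetical illustrations; the paper's scheduler and cost model are considerably more general.

```python
import itertools

# Hypothetical toy example: 4 devices, two fast "local" links and slow
# cross-site links (bandwidths in GB/s). These numbers are assumptions
# chosen only to illustrate the idea of a communication cost model.
BANDWIDTH = {
    (0, 1): 10.0, (1, 0): 10.0,   # fast local link
    (2, 3): 10.0, (3, 2): 10.0,   # fast local link
    (0, 2): 0.5,  (2, 0): 0.5,    # slow cross-site links
    (0, 3): 0.5,  (3, 0): 0.5,
    (1, 2): 0.5,  (2, 1): 0.5,
    (1, 3): 0.5,  (3, 1): 0.5,
}

ACTIVATION_GB = 2.0  # assumed data volume exchanged between adjacent pipeline stages


def pipeline_comm_cost(assignment):
    """Estimated transfer time summed over consecutive tasklets in the pipeline."""
    cost = 0.0
    for src, dst in zip(assignment, assignment[1:]):
        cost += ACTIVATION_GB / BANDWIDTH[(src, dst)]
    return cost


def best_assignment(num_devices=4):
    """Brute-force search over device orderings (only feasible at this toy scale)."""
    best = min(itertools.permutations(range(num_devices)), key=pipeline_comm_cost)
    return best, pipeline_comm_cost(best)


if __name__ == "__main__":
    order, cost = best_assignment()
    print(f"best tasklet placement: {order}, estimated comm time: {cost:.2f} s")
```

In this toy setting the search keeps stages that exchange large activations on the fast links and crosses the slow links as few times as possible; the paper's contribution is doing this kind of optimization at scale with a formal cost model rather than by exhaustive enumeration.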