TY - JOUR
T1 - BiPruneFL: Computation and Communication Efficient Federated Learning With Binary Quantization and Pruning
T2 - IEEE Access
AU - Lee, Sangmin
AU - Jang, Hyeryung
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2025
Y1 - 2025
N2 - Federated learning (FL) is a decentralized learning framework that allows a central server and multiple devices, referred to as clients, to collaboratively train a shared model without the clients transmitting their private data to the server. This approach helps to preserve data privacy and reduce the risk of information leakage. However, FL systems often face significant communication and computational overhead due to frequent exchanges of model parameters and the intensive local training required on resource-constrained clients. Existing solutions typically apply compression techniques such as quantization or pruning but only to a limited extent, constrained by the trade-off between model accuracy and compression efficiency. To address these challenges, we propose BiPruneFL, a communication- and computation-efficient FL framework that combines quantization and pruning while maintaining competitive accuracy. By leveraging recent advances in neural network pruning, BiPruneFL identifies subnetworks within binary neural networks without significantly compromising accuracy. Additionally, we employ communication compression strategies to enable efficient model updates and computationally lightweight local training. Through experiments, we demonstrate that BiPruneFL significantly outperforms other baselines, reducing upstream and downstream communication costs by up to 88.1× and 80.8×, respectively, and reducing computation costs by 3.9× to 34.9× depending on the degree of quantization. Despite these efficiency gains, BiPruneFL achieves accuracy comparable to, and in some cases surpassing, that of uncompressed federated learning models.
AB - Federated learning (FL) is a decentralized learning framework that allows a central server and multiple devices, referred to as clients, to collaboratively train a shared model without the clients transmitting their private data to the server. This approach helps to preserve data privacy and reduce the risk of information leakage. However, FL systems often face significant communication and computational overhead due to frequent exchanges of model parameters and the intensive local training required on resource-constrained clients. Existing solutions typically apply compression techniques such as quantization or pruning but only to a limited extent, constrained by the trade-off between model accuracy and compression efficiency. To address these challenges, we propose BiPruneFL, a communication- and computation-efficient FL framework that combines quantization and pruning while maintaining competitive accuracy. By leveraging recent advances in neural network pruning, BiPruneFL identifies subnetworks within binary neural networks without significantly compromising accuracy. Additionally, we employ communication compression strategies to enable efficient model updates and computationally lightweight local training. Through experiments, we demonstrate that BiPruneFL significantly outperforms other baselines, reducing upstream and downstream communication costs by up to 88.1× and 80.8×, respectively, and reducing computation costs by 3.9× to 34.9× depending on the degree of quantization. Despite these efficiency gains, BiPruneFL achieves accuracy comparable to, and in some cases surpassing, that of uncompressed federated learning models.
KW - Federated learning (FL)
KW - Internet of Things
KW - lottery tickets
KW - neural network pruning
KW - quantization
UR - http://www.scopus.com/inward/record.url?scp=105001064417&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2025.3547627
DO - 10.1109/ACCESS.2025.3547627
M3 - Article
AN - SCOPUS:105001064417
SN - 2169-3536
VL - 13
SP - 42441
EP - 42456
JO - IEEE Access
JF - IEEE Access
ER -