Privacy-Preserving Synthetic Data Generation Method for IoT-Sensor Network IDS Using CTGAN

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

The increased usage of IoT networks brings about new privacy risks, especially when intrusion detection systems (IDSs) rely on large datasets for machine learning (ML) tasks and depend on third parties for storing and training the ML-based IDS. This study proposes a privacy-preserving synthetic data generation method using a conditional tabular generative adversarial network (CTGAN) aimed at maintaining the utility of IoT sensor network data for IDS while safeguarding privacy. We integrate differential privacy (DP) with CTGAN by employing controlled noise injection to mitigate privacy risks. The technique involves dynamic distribution adjustment and quantile matching to balance the utility–privacy tradeoff. The results indicate a significant improvement in data utility compared to the standard DP method, achieving a KS test score of 0.80 while minimizing privacy risks such as singling out, linkability, and inference attacks. This approach ensures that synthetic datasets can support intrusion detection without exposing sensitive information.
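The utility-privacy tradeoff described in the abstract can be illustrated with a minimal sketch on a single numeric sensor column. This is not the authors' implementation: the function names, the Laplace noise shape, the use of 11 coarse quantiles, and the reading of the "KS test score" as one minus the two-sample KS statistic are all assumptions made for illustration. The sketch also does not account for the privacy budget that releasing quantiles of the real data would consume.

```python
import numpy as np

def laplace_noise(values, sensitivity, epsilon, rng):
    # Hypothetical helper: add Laplace noise with scale sensitivity / epsilon,
    # the usual shape of the Laplace mechanism (smaller epsilon, more noise).
    scale = sensitivity / epsilon
    return values + rng.laplace(loc=0.0, scale=scale, size=values.shape)

def quantile_match(noisy, reference, n_quantiles=11):
    # Hypothetical helper: keep only the rank of each noisy value, then map
    # that rank onto a coarse, linearly interpolated quantile curve of the
    # reference column. Assumption: 11 evenly spaced quantile levels.
    ranks = np.argsort(np.argsort(noisy))
    probs = (ranks + 0.5) / len(noisy)
    levels = np.linspace(0.0, 1.0, n_quantiles)
    return np.interp(probs, levels, np.quantile(reference, levels))

def ks_score(a, b):
    # Assumed utility metric: 1 minus the two-sample Kolmogorov-Smirnov
    # statistic, so 1.0 means the two empirical CDFs coincide.
    a = np.sort(a)
    b = np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return 1.0 - np.max(np.abs(cdf_a - cdf_b))

# Toy data standing in for one sensor feature (not from the paper).
rng = np.random.default_rng(0)
real = rng.normal(loc=50.0, scale=5.0, size=2000)
noisy = laplace_noise(real, sensitivity=1.0, epsilon=0.1, rng=rng)
matched = quantile_match(noisy, real)

print("score after noise only:       ", round(ks_score(real, noisy), 3))
print("score after quantile matching:", round(ks_score(real, matched), 3))
```

In this toy setting, noise alone widens the distribution and lowers the score, while mapping ranks back onto a coarse quantile curve recovers most of the marginal shape. How the paper combines this kind of adjustment with CTGAN training is not stated in the abstract.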

Original language: English
Article number: 7389
Journal: Sensors
Volume: 24
Issue number: 22
DOIs
State: Published - Nov 2024

Keywords

  • data utility
  • deep learning
  • differential privacy
  • generative adversarial network
  • Internet of things
  • intrusion detection systems
