Software-defined networking (SDN) is widely used in the smart grid to monitor and manage the communication network, and big data analytics for the SDN-based smart grid has attracted increasing attention. Using machine learning to analyze the large amount of data generated in an SDN-based smart grid is a promising approach. However, the risk of disclosing personal private information must be taken seriously. For instance, data clustering in the analysis of user electricity behavior may disclose such information. In this paper, an optimizing and differentially private clustering algorithm named ODPCA is proposed. In the ODPCA, the differentially private K-means algorithm and the K-modes algorithm are combined to cluster mixed data in a privacy-preserving manner. The allocation of privacy budgets is optimized to improve the accuracy of the clustering results. Specifically, a loss function that accounts for both the numerical and the categorical attributes, measured between the true centroids and the noisy centroids, is analyzed to optimize the allocation of the privacy budget; the number of clustering iterations is set to a fixed value based on the total privacy budget and the minimal privacy budget allocated to each iteration. The ODPCA is proved to satisfy the differential privacy requirements, and comparisons with other popular algorithms show that it performs better.
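The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ODPCA: the function names are hypothetical, numerical attributes are assumed to be scaled to [0, 1], neighboring datasets are assumed to differ by adding or removing one record, and the loss-based budget optimization is not reproduced. The sketch only shows how a fixed iteration count could follow from the total and minimal budgets, and how Laplace noise could be added to a numerical centroid (K-means) and a categorical mode (K-modes).

```python
import math
import random


def iteration_count(total_budget, min_budget):
    """Largest number of iterations such that each gets at least min_budget."""
    return max(1, math.floor(total_budget / min_budget))


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_numeric_centroid(points, epsilon, rng):
    """Noisy mean of a non-empty cluster whose coordinates lie in [0, 1]."""
    dim = len(points[0])
    # Assumption: one record changes the per-coordinate sums by at most
    # `dim` in total and the count by 1, so the L1 sensitivity is dim + 1.
    scale = (dim + 1) / epsilon
    count = max(1.0, len(points) + laplace(scale, rng))
    sums = [sum(p[j] for p in points) + laplace(scale, rng) for j in range(dim)]
    # Clamp back into the assumed data range.
    return [min(1.0, max(0.0, s / count)) for s in sums]


def noisy_mode(values, categories, epsilon, rng):
    """Most frequent category after adding Laplace noise to each count."""
    # Assumption: one record changes a single count by 1 (sensitivity 1).
    counts = {c: values.count(c) + laplace(1.0 / epsilon, rng) for c in categories}
    return max(counts, key=counts.get)


rng = random.Random(0)
total_budget, min_budget = 1.0, 0.25
iterations = iteration_count(total_budget, min_budget)
# Uniform split per iteration: an assumption, not the optimized allocation.
epsilon_per_iteration = total_budget / iterations
centroid = noisy_numeric_centroid([[0.2, 0.4], [0.3, 0.5]], epsilon_per_iteration / 2, rng)
mode = noisy_mode(["peak", "peak", "off-peak"], ["peak", "off-peak"], epsilon_per_iteration / 2, rng)
```

Here each iteration's budget is halved between the numerical and categorical updates for simplicity; the abstract instead describes choosing that allocation by analyzing a loss function over both attribute types.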
ASJC Scopus subject areas
- General Computer Science