In 2021, 1,862 data breaches affected nearly 294 million Americans.
That trend is troubling for machine learning researchers, whose models are trained on enormous data sets that may contain sensitive personal information.
Jian Liu, an assistant professor in the Min H. Kao Department of Electrical Engineering and Computer Science, is looking for a reliable way of using federated learning to mitigate the privacy risks to sensitive data in machine learning.
While most machine learning models use centralized training, which requires users to upload their personal data to a remote server, federated learning distributes the training process across users' devices. The model learns from users' data without recording that data on an external server.
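The idea can be sketched as a minimal federated-averaging loop. Everything below is illustrative, not Liu's actual system: the function names, the simple linear model, and the simulated clients are assumptions made for the sake of a runnable example. The key point is that each client computes an update on its own data, and only the updates reach the server.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1):
    """One gradient step on a client's private data (least-squares loss).
    Runs on the client's device; X and y never leave it."""
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(weights, clients):
    """The server sends the current model out, collects each client's
    locally computed update, and averages them. It never sees raw data."""
    updates = [local_update(weights, X, y) for X, y in clients]
    return np.mean(updates, axis=0)

# Simulate three clients, each holding its own private data set.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(20, 2))
    clients.append((X, X @ true_w))

# Training: repeated rounds of local computation plus server-side averaging.
w = np.zeros(2)
for _ in range(200):
    w = federated_round(w, clients)
```

After enough rounds the averaged model recovers the underlying weights, even though no client's data was ever pooled centrally.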
“Recent studies have revealed that private information can still be leaked through shared model updates. In addition, there are billions of mobile devices which can participate in the model learning. We cannot know whether a particular device is trustworthy or if it is malicious,” Liu said.
Liu’s StART project team is trying to develop attack-resilient and privacy-preserving artificial intelligence models. They have published four papers so far.
“We proposed a method that can serve as a tool for empirically measuring the amount of privacy leakage in federated learning to facilitate the design of more robust defense mechanisms,” he said. “We also proposed a secure aggregation rule that can mitigate potential failures and attacks in federated learning. And we designed and evaluated privacy-preserving federated models that can help diagnose Alzheimer’s disease and depression based on speech.”
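The article does not specify the secure aggregation rule Liu's team proposed. As one hedged illustration of why the choice of rule matters, a classic robust alternative to plain averaging is the coordinate-wise median, which bounds the influence of a minority of malicious (Byzantine) clients:

```python
import numpy as np

def mean_aggregate(updates):
    """Plain averaging: a single malicious update can move the result arbitrarily."""
    return np.mean(updates, axis=0)

def median_aggregate(updates):
    """Coordinate-wise median: an extreme update from one client is
    largely ignored as long as most clients are honest."""
    return np.median(updates, axis=0)

# Three honest clients send similar updates; one attacker sends a poisoned one.
honest = [np.array([1.0, 1.0]), np.array([1.1, 0.9]), np.array([0.9, 1.1])]
malicious = np.array([1000.0, -1000.0])
updates = honest + [malicious]
```

Here the mean is dragged far from the honest consensus, while the median stays close to it, which is the kind of failure- and attack-resilience a secure aggregation rule is designed to provide.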
Liu’s team (which includes Jiaxin Zhang, a staff scientist in Oak Ridge National Laboratory’s Computer Science and Mathematics Division, and Luyang Liu, a research scientist at Google) plans to submit a National Science Foundation proposal this year.