EFFECTIVE KNOWLEDGE REPRESENTATION AND UTILIZATION FOR SUSTAINABLE COLLABORATIVE LEARNING ACROSS HETEROGENEOUS SYSTEMS

Hoang Trong Nghia¹
¹ School of Electrical Engineering and Computer Science, Washington State University, USA


Abstract

The increasingly decentralized and private nature of data in our digital society has motivated the development of collaborative intelligent systems that enable knowledge aggregation among data owners. However, collaborative learning has so far been investigated only in simple settings. For example, clients are often assumed to train solution models de novo, disregarding all prior expertise. The learned model is typically represented in task-specific forms that do not generalize to unseen, emerging scenarios. Finally, a universal model representation is enforced among collaborators, ignoring their local compute constraints or input representations. These limitations hamper the practicality of prior collaborative systems in learning scenarios with limited task data that demand constant knowledge adaptation and transfer across information silos, tasks, and learning models, as well as the utilization of prior solution expertise. Furthermore, prior collaborative learning frameworks are not sustainable at a macro scale where participants desire a fair allocation of benefits (e.g., access to the combined model) based on their costs of participation (e.g., the overhead of model sharing and training synchronization, the risk of information breaches, etc.). This necessitates a new perspective on collaborative learning in which the server not only aggregates but also conducts valuation of each participant's contribution, and distributes aggregated information to individuals commensurate with their contributions. To substantiate the above vision, we propose a new research agenda on developing effective and sustainable collaborative learning frameworks across heterogeneous systems, featuring three novel computational capabilities for knowledge organization: model expression, comprehension, and valuation.
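The contribution-valuation step envisioned above is commonly grounded in cooperative game theory, e.g., the Shapley value used in the data-valuation literature cited below (Ghorbani & Zou, 2019; Wang et al., 2020). As a minimal, illustrative sketch (not the authors' method), the snippet below computes exact Shapley values of participants with respect to a coalition utility function; in a real federated system the utility would be, for instance, the validation accuracy of the model aggregated from a coalition's updates, and the `quality` scores used in the example are purely hypothetical:

```python
import itertools
from math import factorial

def shapley_values(clients, utility):
    """Exact Shapley value of each client w.r.t. a coalition utility.

    `utility` maps a tuple of client ids to a real-valued payoff
    (e.g., accuracy of the model aggregated from that coalition).
    Exact computation enumerates all 2^(n-1) subsets per client, so
    this is only feasible for small n; large n requires sampling.
    """
    n = len(clients)
    values = {c: 0.0 for c in clients}
    for c in clients:
        others = [x for x in clients if x != c]
        for r in range(n):
            # Weight of a marginal contribution to a coalition of size r.
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in itertools.combinations(others, r):
                values[c] += weight * (utility(subset + (c,)) - utility(subset))
    return values

# Toy additive utility: each client's (hypothetical) "quality" contributes
# independently, so each Shapley value recovers that client's own quality.
quality = {"a": 3.0, "b": 1.0, "c": 2.0}
u = lambda coalition: sum(quality[c] for c in coalition)
print(shapley_values(list(quality), u))
```

The Shapley value is one natural candidate for the abstract's "commensurate" benefit distribution because it satisfies efficiency (the values sum to the grand coalition's utility) and symmetry (identical contributors receive identical value).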


References

Bui, L. M., Huu, T. T., Dinh, D., Nguyen, T. M., & Hoang, T. N. (2024). Revisiting Kernel Attention with Correlated Gaussian Process Representation. In UAI.
Collins, L., Hassani, H., Mokhtari, A., & Shakkottai, S. (2021). Exploiting Shared Representations for Personalized Federated Learning. In Proc. ICML, 2089–2099.
Dhanaraju, M., Chenniappan, P., Ramalingam, K., Pazhanivelan, S., & Kaliaperumal, R. (2022). Smart Farming: Internet of Things (IoT)-Based Sustainable Agriculture. Agriculture, 12(10).
Fan, Z., Fang, H., Wang, X., Zhou, Z., Pei, J., Friedlander, M., & Zhang, Y. (2024). Fair and Efficient Contribution Valuation for Vertical Federated Learning. In The Twelfth International Conference on Learning Representations.
Ghorbani, A., & Zou, J. (2019). Data Shapley: Equitable Valuation of Data for Machine Learning. arXiv:1904.02868.
Ghorbani, A., Kim, M. P., & Zou, J. Y. (2020). A Distributional Framework for Data Valuation. CoRR, abs/2002.12334.
Hanzely, F., & Richtarik, P. (2020). Federated Learning of a Mixture of Global and Local Models. CoRR, abs/2002.05516.
Hassantabar, S., Stefano, N., Ghanakota, V., Ferrari, A., Nicola, G. N., Bruno, R., Marino, I. R., & Jha, N. K. (2020). CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks. CoRR, abs/2007.10497.
Hoang, M., & Hoang, T. N. (2024). Few-Shot Learning via Repurposing Ensemble of Black-Box Models. In AAAI.
Hoang, Q. M., Hoang, T. N., Low, K. H., & Kingsford, C. (2019a). Collective Model Fusion of Multiple Black-Box Experts. In Proc. ICML.
Hoang, T. N., Hoang, Q. M., Low, K. H., & How, J. P. (2019b). Collective Online Learning of Gaussian Processes in Massive Multi-Agent Systems. In Proc. AAAI.
Hoang, T. N., Lam, C. T., Low, K. H., & Jaillet, P. (2020). Learning Task-Agnostic Embedding of Multiple Black-Box Experts for Multi-Task Model Fusion. In Proc. ICML.
Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., & Chen, W. (2021). LoRA: Low-Rank Adaptation of Large Language Models. CoRR, abs/2106.09685.
Itti, L., & Baldi, P. (2009). Bayesian Surprise Attracts Human Attention. Vision Research, 49(10), 1295–1306.
Karimireddy, S. P., Kale, S., Mohri, M., Reddi, S., Stich, S., & Suresh, A. T. (2020). SCAFFOLD: Stochastic Controlled Averaging for Federated Learning. In Proc. ICML, 5132–5143.
Kingma, D., & Welling, M. (2013). Auto-Encoding Variational Bayes. In Proc. ICLR.
Konečný, J., McMahan, H. B., Ramage, D., & Richtárik, P. (2016). Federated Optimization: Distributed Machine Learning for On-Device Intelligence. CoRR, abs/1610.02527.
Lam, C. T., Hoang, T. N., Low, K. H., & Jaillet, P. (2021). Model Fusion for Personalized Learning. In Proc. ICML.
Li, T., Hu, S., Beirami, A., & Smith, V. (2021). Ditto: Fair and Robust Federated Learning Through Personalization. In Proc. ICML, 6357–6368.
Li, W., Fu, S., Zhang, F., & Pang, Y. (2023). Data Valuation and Detections in Federated Learning. ArXiv, abs/2311.05304.
McMahan, H. B., Moore, E., Ramage, D., Hampson, S., & y Arcas, B. A. (2017). Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proc. AISTATS, 1273–1282.
Mironov, I. (2017). Rényi Differential Privacy. In Proc. 30th IEEE Computer Security Foundations Symposium (CSF), 263–275.
Sim, R. H. L., Zhang, Y., Hoang, T. N., Xu, X., Low, B. K. H., & Jaillet, P. (2023). Incentives in Private Collaborative Machine Learning. In Proceedings of NeurIPS.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In NeurIPS, 5998–6008.
Wang, T., Rausch, J., Zhang, C., Jia, R., & Song, D. (2020). A Principled Approach to Data Valuation for Federated Learning. CoRR, abs/2009.06192.
Wang, Z., Zhang, Z., Lee, C., Zhang, H., Sun, R., Ren, X., Su, G., Perot, V., Dy, J. G., & Pfister, T. (2021). Learning to Prompt for Continual Learning. CoRR, abs/2112.08654.
Wei, S., Tong, Y., Zhou, Z., & Song, T. (2020). Efficient and Fair Data Valuation for Horizontal Federated Learning, 139–152. Cham: Springer International Publishing. ISBN 978-3-030-63076-8.
Yoon, J., Arik, S. O., & Pfister, T. (2019). Data Valuation using Reinforcement Learning. CoRR, abs/1909.11671.
Yurochkin, M., Agarwal, M., Ghosh, S., Greenewald, K., Hoang, T. N., & Khazaeni, Y. (2019a). Bayesian Nonparametric Federated Learning of Neural Networks. In Proc. ICML.
Yurochkin, M., Agarwal, M., Ghosh, S., Greenewald, K., & Hoang, T. N. (2019b). Statistical Model Aggregation via Parameter Matching. In Proc. NeurIPS, 10954–10964.