[[2209.04149] Post-Quantum Oblivious Transfer from Smooth Projective Hash Functions with Grey Zone](http://arxiv.org/abs/2209.04149)
Oblivious Transfer (OT) is a major primitive for secure multiparty computation. Indeed, combined with symmetric primitives along with garbled circuits, it allows any secure function evaluation between two parties. In this paper, we propose a new approach to build OT protocols. Interestingly, our new paradigm features a security analysis in the Universal Composability (UC) framework and may be instantiated from post-quantum primitives. In order to do so, we define a new primitive named Smooth Projective Hash Function with Grey Zone (SPHFwGZ) which can be seen as a relaxation of the classical Smooth Projective Hash Functions, with a subset of the words for which one cannot claim correctness nor smoothness: the grey zone. As a concrete application, we provide two instantiations of SPHFwGZ respectively based on the Diffie-Hellman and the Learning With Errors (LWE) problems. Hence, we propose a quantum-resistant OT protocol with UC-security in the random oracle model.
[[2209.04006] What is Software Supply Chain Security?](http://arxiv.org/abs/2209.04006)
The software supply chain involves a multitude of tools and processes that enable software developers to write, build, and ship applications. Recently, security compromises of tools or processes has led to a surge in proposals to address these issues. However, these proposals commonly overemphasize specific solutions or conflate goals, resulting in unexpected consequences, or unclear positioning and usage.
In this paper, we make the case that developing practical solutions is not possible until the community has a holistic view of the security problem; this view must include both the technical and procedural aspects. To this end, we examine three use cases to identify common security goals, and present a goal-oriented taxonomy of existing solutions demonstrating a holistic overview of software supply chain security.
[[2209.04028] Evaluating the Security of Aircraft Systems](http://arxiv.org/abs/2209.04028)
The sophistication and complexity of cyber attacks and the variety of targeted platforms have been growing in recent years. Various adversaries are abusing an increasing range of platforms, e.g., enterprise platforms, mobile phones, PCs, transportation systems, and industrial control systems. In recent years, we have witnessed various cyber attacks on transportation systems, including attacks on ports, airports, and trains. It is only a matter of time before transportation systems become a more common target of cyber attackers. Due to the enormous potential damage inherent in attacking vehicles carrying many passengers and the lack of security measures applied in traditional airborne systems, the vulnerability of aircraft systems is one of the most concerning topics in the vehicle security domain. This paper provides a comprehensive review of aircraft systems and components and their various networks, emphasizing the cyber threats they are exposed to and the impact of a cyber attack on these components and networks and the essential capabilities of the aircraft. In addition, we present a comprehensive and in-depth taxonomy that standardizes the knowledge and understanding of cyber security in the avionics field from an adversary's perspective. The taxonomy divides techniques into relevant categories (tactics) reflecting the various phases of the adversarial attack lifecycle and maps existing attacks according to the MITRE ATT&CK methodology. Furthermore, we analyze the security risks among the various systems according to the potential threat actors and categorize the threats based on STRIDE threat model. Future work directions are presented as guidelines for industry and academia.
[[2209.04056] Generating Contextual Load Profiles Using a Conditional Variational Autoencoder](http://arxiv.org/abs/2209.04056)
Generating power system states that have similar distribution and dependency to the historical ones is essential for the tasks of system planning and security assessment, especially when the historical data is insufficient. In this paper, we described a generative model for load profiles of industrial and commercial customers, based on the conditional variational autoencoder (CVAE) neural network architecture, which is challenging due to the highly variable nature of such profiles. Generated contextual load profiles were conditioned on the month of the year and typical power exchange with the grid. Moreover, the quality of generations was both visually and statistically evaluated. The experimental results demonstrate our proposed CVAE model can capture temporal features of historical load profiles and generate `realistic' data with satisfying univariate distributions and multivariate dependencies.
[[2209.04027] Cross-Modal Knowledge Transfer Without Task-Relevant Source Data](http://arxiv.org/abs/2209.04027)
Cost-effective depth and infrared sensors as alternatives to usual RGB sensors are now a reality, and have some advantages over RGB in domains like autonomous navigation and remote sensing. As such, building computer vision and deep learning systems for depth and infrared data are crucial. However, large labeled datasets for these modalities are still lacking. In such cases, transferring knowledge from a neural network trained on a well-labeled large dataset in the source modality (RGB) to a neural network that works on a target modality (depth, infrared, etc.) is of great value. For reasons like memory and privacy, it may not be possible to access the source data, and knowledge transfer needs to work with only the source models. We describe an effective solution, SOCKET: SOurce-free Cross-modal KnowledgE Transfer for this challenging task of transferring knowledge from one source modality to a different target modality without access to task-relevant source data. The framework reduces the modality gap using paired task-irrelevant data, as well as by matching the mean and variance of the target features with the batch-norm statistics that are present in the source models. We show through extensive experiments that our method significantly outperforms existing source-free methods for classification tasks which do not account for the modality gap.
[[2209.04030] Uncovering the Connection Between Differential Privacy and Certified Robustness of Federated Learning against Poisoning Attacks](http://arxiv.org/abs/2209.04030)
Federated learning (FL) provides an efficient paradigm to jointly train a global model leveraging data from distributed users. As the local training data come from different users who may not be trustworthy, several studies have shown that FL is vulnerable to poisoning attacks. Meanwhile, to protect the privacy of local users, FL is always trained in a differentially private way (DPFL). Thus, in this paper, we ask: Can we leverage the innate privacy property of DPFL to provide certified robustness against poisoning attacks? Can we further improve the privacy of FL to improve such certification? We first investigate both user-level and instance-level privacy of FL and propose novel mechanisms to achieve improved instance-level privacy. We then provide two robustness certification criteria: certified prediction and certified attack cost for DPFL on both levels. Theoretically, we prove the certified robustness of DPFL under a bounded number of adversarial users or instances. Empirically, we conduct extensive experiments to verify our theories under a range of attacks on different datasets. We show that DPFL with a tighter privacy guarantee always provides stronger robustness certification in terms of certified attack cost, but the optimal certified prediction is achieved under a proper balance between privacy protection and utility loss.
[[2209.04053] Algorithms with More Granular Differential Privacy Guarantees](http://arxiv.org/abs/2209.04053)
Differential privacy is often applied with a privacy parameter that is larger than the theory suggests is ideal; various informal justifications for tolerating large privacy parameters have been proposed. In this work, we consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis. In this framework, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller that the best possible privacy parameter for the entire record of a person (i.e., all the attributes).
[[2209.04379] Minimizing Information Leakage under Padding Constraints](http://arxiv.org/abs/2209.04379)
An attacker can gain information of a user by analyzing its network traffic. The size of transferred data leaks information about the file being transferred or the service being used, and this is particularly revealing when the attacker has background knowledge about the files or services available for transfer. To prevent this, servers may pad their files using a padding scheme, changing the file sizes and preventing anyone from guessing their identity uniquely. This work focuses on finding optimal padding schemes that keep a balance between privacy and the costs of bandwidth increase. We consider R\'enyi-min leakage as our main measure for privacy, since it is directly related with the success of a simple attacker, and compare our algorithms with an existing solution that minimizes Shannon leakage. We provide improvements to our algorithms in order to optimize average total padding and Shannon leakage while minimizing R\'enyi-min leakage. Moreover, our algorithms are designed to handle a more general and important scenario in which multiple servers wish to compute padding schemes in a way that protects the servers' identity in addition to the identity of the files.
[[2209.04419] Majority Vote for Distributed Differentially Private Sign Selection](http://arxiv.org/abs/2209.04419)
Privacy-preserving data analysis has become prevailing in recent years. In this paper, we propose a distributed group differentially private majority vote mechanism for the sign selection problem in a distributed setup. To achieve this, we apply the iterative peeling to the stability function and use the exponential mechanism to recover the signs. As applications, we study the private sign selection for mean estimation and linear regression problems in distributed systems. Our method recovers the support and signs with the optimal signal-to-noise ratio as in the non-private scenario, which is better than contemporary works of private variable selections. Moreover, the sign selection consistency is justified with theoretical guarantees. Simulation studies are conducted to demonstrate the effectiveness of our proposed method.
[[2209.04022] Privacy of Autonomous Vehicles: Risks, Protection Methods, and Future Directions](http://arxiv.org/abs/2209.04022)
Recent advances in machine learning have enabled its wide application in different domains, and one of the most exciting applications is autonomous vehicles (AVs), which have encouraged the development of a number of ML algorithms from perception to prediction to planning. However, training AVs usually requires a large amount of training data collected from different driving environments (e.g., cities) as well as different types of personal information (e.g., working hours and routes). Such collected large data, treated as the new oil for ML in the data-centric AI era, usually contains a large amount of privacy-sensitive information which is hard to remove or even audit. Although existing privacy protection approaches have achieved certain theoretical and empirical success, there is still a gap when applying them to real-world applications such as autonomous vehicles. For instance, when training AVs, not only can individually identifiable information reveal privacy-sensitive information, but also population-level information such as road construction within a city, and proprietary-level commercial secrets of AVs. Thus, it is critical to revisit the frontier of privacy risks and corresponding protection approaches in AVs to bridge this gap. Following this goal, in this work, we provide a new taxonomy for privacy risks and protection methods in AVs, and we categorize privacy in AVs into three levels: individual, population, and proprietary. We explicitly list out recent challenges to protect each of these levels of privacy, summarize existing solutions to these challenges, discuss the lessons and conclusions, and provide potential future directions and opportunities for both researchers and practitioners. We believe this work will help to shape the privacy research in AV and guide the privacy protection technology design.
[[2209.04354] On Specification-based Cyber-Attack Detection in Smart Grids](http://arxiv.org/abs/2209.04354)
The transformation of power grids into intelligent cyber-physical systems brings numerous benefits, but also significantly increases the surface for cyber-attacks, demanding appropriate countermeasures. However, the development, validation, and testing of data-driven countermeasures against cyber-attacks, such as machine learning-based detection approaches, lack important data from real-world cyber incidents. Unlike attack data from real-world cyber incidents, infrastructure knowledge and standards are accessible through expert and domain knowledge. Our proposed approach uses domain knowledge to define the behavior of a smart grid under non-attack conditions and detect attack patterns and anomalies. Using a graph-based specification formalism, we combine cross-domain knowledge that enables the generation of whitelisting rules not only for statically defined protocol fields but also for communication ows and technical operation boundaries. Finally, we evaluate our specification-based intrusion detection system against various attack scenarios and assess detection quality and performance. In particular, we investigate a data manipulation attack in a future-orientated use case of an IEC 60870-based SCADA system that controls distributed energy resources in the distribution grid. Our approach can detect severe data manipulation attacks with high accuracy in a timely and reliable manner.
[[2209.04093] Learning Audio-Visual embedding for Wild Person Verification](http://arxiv.org/abs/2209.04093)
It has already been observed that audio-visual embedding can be extracted from these two modalities to gain robustness for person verification. However, the aggregator that used to generate a single utterance representation from each frame does not seem to be well explored. In this article, we proposed an audio-visual network that considers aggregator from a fusion perspective. We introduced improved attentive statistics pooling for the first time in face verification. Then we find that strong correlation exists between modalities during pooling, so joint attentive pooling is proposed which contains cycle consistency to learn the implicit inter-frame weight. Finally, fuse the modality with a gated attention mechanism. All the proposed models are trained on the VoxCeleb2 dev dataset and the best system obtains 0.18\%, 0.27\%, and 0.49\% EER on three official trail lists of VoxCeleb1 respectively, which is to our knowledge the best-published results for person verification. As an analysis, visualization maps are generated to explain how this system interact between modalities.
[[2209.04097] MassMIND: Massachusetts Maritime INfrared Dataset](http://arxiv.org/abs/2209.04097)
Recent advances in deep learning technology have triggered radical progress in the autonomy of ground vehicles. Marine coastal Autonomous Surface Vehicles (ASVs) that are regularly used for surveillance, monitoring and other routine tasks can benefit from this autonomy. Long haul deep sea transportation activities are additional opportunities. These two use cases present very different terrains -- the first being coastal waters -- with many obstacles, structures and human presence while the latter is mostly devoid of such obstacles. Variations in environmental conditions are common to both terrains. Robust labeled datasets mapping such terrains are crucial in improving the situational awareness that can drive autonomy. However, there are only limited such maritime datasets available and these primarily consist of optical images. Although, Long Wave Infrared (LWIR) is a strong complement to the optical spectrum that helps in extreme light conditions, a labeled public dataset with LWIR images does not currently exist. In this paper, we fill this gap by presenting a labeled dataset of over 2,900 LWIR segmented images captured in coastal maritime environment under diverse conditions. The images are labeled using instance segmentation and classified in seven categories -- sky, water, obstacle, living obstacle, bridge, self and background. We also evaluate this dataset across three deep learning architectures (UNet, PSPNet, DeepLabv3) and provide detailed analysis of its efficacy. While the dataset focuses on the coastal terrain it can equally help deep sea use cases. Such terrain would have less traffic, and the classifier trained on cluttered environment would be able to handle sparse scenes effectively. We share this dataset with the research community with the hope that it spurs new scene understanding capabilities in the maritime environment.
[[2209.04113] Robust and Lossless Fingerprinting of Deep Neural Networks via Pooled Membership Inference](http://arxiv.org/abs/2209.04113)
Deep neural networks (DNNs) have already achieved great success in a lot of application areas and brought profound changes to our society. However, it also raises new security problems, among which how to protect the intellectual property (IP) of DNNs against infringement is one of the most important yet very challenging topics. To deal with this problem, recent studies focus on the IP protection of DNNs by applying digital watermarking, which embeds source information and/or authentication data into DNN models by tuning network parameters directly or indirectly. However, tuning network parameters inevitably distorts the DNN and therefore surely impairs the performance of the DNN model on its original task regardless of the degree of the performance degradation. It has motivated the authors in this paper to propose a novel technique called \emph{pooled membership inference (PMI)} so as to protect the IP of the DNN models. The proposed PMI neither alters the network parameters of the given DNN model nor fine-tunes the DNN model with a sequence of carefully crafted trigger samples. Instead, it leaves the original DNN model unchanged, but can determine the ownership of the DNN model by inferring which mini-dataset among multiple mini-datasets was once used to train the target DNN model, which differs from previous arts and has remarkable potential in practice. Experiments also have demonstrated the superiority and applicability of this work.
[[2209.04278] Deep learning-based Crop Row Following for Infield Navigation of Agri-Robots](http://arxiv.org/abs/2209.04278)
Autonomous navigation in agricultural environments is often challenged by varying field conditions that may arise in arable fields. The state-of-the-art solutions for autonomous navigation in these agricultural environments will require expensive hardware such as RTK-GPS. This paper presents a robust crop row detection algorithm that can withstand those variations while detecting crop rows for visual servoing. A dataset of sugar beet images was created with 43 combinations of 11 field variations found in arable fields. The novel crop row detection algorithm is tested both for the crop row detection performance and also the capability of visual servoing along a crop row. The algorithm only uses RGB images as input and a convolutional neural network was used to predict crop row masks. Our algorithm outperformed the baseline method which uses colour-based segmentation for all the combinations of field variations. We use a combined performance indicator that accounts for the angular and displacement errors of the crop row detection. Our algorithm exhibited the worst performance during the early growth stages of the crop.
[[2209.04026] SPIDER: A Practical Fuzzing Framework to Uncover Stateful Performance Issues in SDN Controllers](http://arxiv.org/abs/2209.04026)
Performance issues in software-defined network (SDN) controllers can have serious impacts on the performance and availability of networks. We specifically consider stateful performance issues, where a sequence of initial input messages drives an SDN controller into a state such that its performance degrades pathologically when processing subsequent messages. We identify key challenges in applying canonical program analysis techniques: large input space of messages (e.g., stateful OpenFlow protocol), complex code base and software architecture (e.g., OSGi framework with dynamic launch), and the semantic dependencies between the internal state and external inputs. We design SPIDER, a practical fuzzing workflow that tackles these challenges and automatically uncovers such issues in SDN controllers. SPIDER's design entails a careful synthesis and extension of semantic fuzzing, performance fuzzing, and static analysis, taken together with domain-specific insights to tackle these challenges. We show that our design workflow is robust across two controllers -- ONOS and OpenDaylight -- with very different internal implementations. Using SPIDER, we were able to identify and confirm multiple stateful performance issues.
[[2209.04067] RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk](http://arxiv.org/abs/2209.04067)
Prior work on safe Reinforcement Learning (RL) has studied risk-aversion to randomness in dynamics (aleatory) and to model uncertainty (epistemic) in isolation. We propose and analyze a new framework to jointly model the risk associated with epistemic and aleatory uncertainties in finite-horizon and discounted infinite-horizon MDPs. We call this framework that combines Risk-Averse and Soft-Robust methods RASR. We show that when the risk-aversion is defined using either EVaR or the entropic risk, the optimal policy in RASR can be computed efficiently using a new dynamic program formulation with a time-dependent risk level. As a result, the optimal risk-averse policies are deterministic but time-dependent, even in the infinite-horizon discounted setting. We also show that particular RASR objectives reduce to risk-averse RL with mean posterior transition probabilities. Our empirical results show that our new algorithms consistently mitigate uncertainty as measured by EVaR and other standard risk measures.
[[2209.04187] Efficient Multi-view Clustering via Unified and Discrete Bipartite Graph Learning](http://arxiv.org/abs/2209.04187)
Although previous graph-based multi-view clustering algorithms have gained significant progress, most of them are still faced with three limitations. First, they often suffer from high computational complexity, which restricts their applications in large-scale scenarios. Second, they usually perform graph learning either at the single-view level or at the view-consensus level, but often neglect the possibility of the joint learning of single-view and consensus graphs. Third, many of them rely on the $k$-means for discretization of the spectral embeddings, which lack the ability to directly learn the graph with discrete cluster structure. In light of this, this paper presents an efficient multi-view clustering approach via unified and discrete bipartite graph learning (UDBGL). Specifically, the anchor-based subspace learning is incorporated to learn the view-specific bipartite graphs from multiple views, upon which the bipartite graph fusion is leveraged to learn a view-consensus bipartite graph with adaptive weight learning. Further, the Laplacian rank constraint is imposed to ensure that the fused bipartite graph has discrete cluster structures (with a specific number of connected components). By simultaneously formulating the view-specific bipartite graph learning, the view-consensus bipartite graph learning, and the discrete cluster structure learning into a unified objective function, an efficient minimization algorithm is then designed to tackle this optimization problem and directly achieve a discrete clustering solution without requiring additional partitioning, which notably has linear time complexity in data size. Experiments on a variety of multi-view datasets demonstrate the robustness and efficiency of our UDBGL approach.
[[2209.04254] Shapley value-based approaches to explain the robustness of classifiers in machine learning](http://arxiv.org/abs/2209.04254)
In machine learning, the use of algorithm-agnostic approaches is an emerging area of research for explaining the contribution of individual features towards the predicted outcome. Whilst there is a focus on explaining the prediction itself, a little has been done on explaining the robustness of these models, that is, how each feature contributes towards achieving that robustness. In this paper, we propose the use of Shapley values to explain the contribution of each feature towards the model's robustness, measured in terms of Receiver-operating Characteristics (ROC) curve and the Area under the ROC curve (AUC). With the help of an illustrative example, we demonstrate the proposed idea of explaining the ROC curve, and visualising the uncertainties in these curves. For imbalanced datasets, the use of Precision-Recall Curve (PRC) is considered more appropriate, therefore we also demonstrate how to explain the PRCs with the help of Shapley values.
[[2209.04293] Robust-by-Design Classification via Unitary-Gradient Neural Networks](http://arxiv.org/abs/2209.04293)
The use of neural networks in safety-critical systems requires safe and robust models, due to the existence of adversarial attacks. Knowing the minimal adversarial perturbation of any input x, or, equivalently, knowing the distance of x from the classification boundary, allows evaluating the classification robustness, providing certifiable predictions. Unfortunately, state-of-the-art techniques for computing such a distance are computationally expensive and hence not suited for online applications. This work proposes a novel family of classifiers, namely Signed Distance Classifiers (SDCs), that, from a theoretical perspective, directly output the exact distance of x from the classification boundary, rather than a probability score (e.g., SoftMax). SDCs represent a family of robust-by-design classifiers. To practically address the theoretical requirements of a SDC, a novel network architecture named Unitary-Gradient Neural Network is presented. Experimental results show that the proposed architecture approximates a signed distance classifier, hence allowing an online certifiable classification of x at the cost of a single inference.
[[2209.04163] Estimating Multi-label Accuracy using Labelset Distributions](http://arxiv.org/abs/2209.04163)
A multi-label classifier estimates the binary label state (relevant vs irrelevant) for each of a set of concept labels, for any given instance. Probabilistic multi-label classifiers provide a predictive posterior distribution over all possible labelset combinations of such label states (the powerset of labels) from which we can provide the best estimate, simply by selecting the labelset corresponding to the largest expected accuracy, over that distribution. For example, in maximizing exact match accuracy, we provide the mode of the distribution. But how does this relate to the confidence we may have in such an estimate? Confidence is an important element of real-world applications of multi-label classifiers (as in machine learning in general) and is an important ingredient in explainability and interpretability. However, it is not obvious how to provide confidence in the multi-label context and relating to a particular accuracy metric, and nor is it clear how to provide a confidence which correlates well with the expected accuracy, which would be most valuable in real-world decision making. In this article we estimate the expected accuracy as a surrogate for confidence, for a given accuracy metric. We hypothesise that the expected accuracy can be estimated from the multi-label predictive distribution. We examine seven candidate functions for their ability to estimate expected accuracy from the predictive distribution. We found three of these to correlate to expected accuracy and are robust. Further, we determined that each candidate function can be used separately to estimate Hamming similarity, but a combination of the candidates was best for expected Jaccard index and exact match.
[[2209.04112] Joint Alignment of Multi-Task Feature and Label Spaces for Emotion Cause Pair Extraction](http://arxiv.org/abs/2209.04112)
Emotion cause pair extraction (ECPE), as one of the derived subtasks of emotion cause analysis (ECA), shares rich inter-related features with emotion extraction (EE) and cause extraction (CE). Therefore EE and CE are frequently utilized as auxiliary tasks for better feature learning, modeled via multi-task learning (MTL) framework by prior works to achieve state-of-the-art (SoTA) ECPE results. However, existing MTL-based methods either fail to simultaneously model the specific features and the interactive feature in between, or suffer from the inconsistency of label prediction. In this work, we consider addressing the above challenges for improving ECPE by performing two alignment mechanisms with a novel A^2Net model. We first propose a feature-task alignment to explicitly model the specific emotion-&cause-specific features and the shared interactive feature. Besides, an inter-task alignment is implemented, in which the label distance between the ECPE and the combinations of EE&CE are learned to be narrowed for better label consistency. Evaluations of benchmarks show that our methods outperform current best-performing systems on all ECA subtasks. Further analysis proves the importance of our proposed alignment mechanisms for the task.
[[2209.04418] Trustworthy Federated Learning via Blockchain](http://arxiv.org/abs/2209.04418)
The safety-critical scenarios of artificial intelligence (AI), such as autonomous driving, Internet of Things, smart healthcare, etc., have raised critical requirements of trustworthy AI to guarantee the privacy and security with reliable decisions. As a nascent branch for trustworthy AI, federated learning (FL) has been regarded as a promising privacy preserving framework for training a global AI model over collaborative devices. However, security challenges still exist in the FL framework, e.g., Byzantine attacks from malicious devices, and model tampering attacks from malicious server, which will degrade or destroy the accuracy of trained global AI model. In this paper, we shall propose a decentralized blockchain based FL (B-FL) architecture by using a secure global aggregation algorithm to resist malicious devices, and deploying practical Byzantine fault tolerance consensus protocol with high effectiveness and low energy consumption among multiple edge servers to prevent model tampering from the malicious server. However, to implement B-FL system at the network edge, multiple rounds of cross-validation in blockchain consensus protocol will induce long training latency. We thus formulate a network optimization problem that jointly considers bandwidth and power allocation for the minimization of long-term average training latency consisting of progressive learning rounds. We further propose to transform the network optimization problem as a Markov decision process and leverage the deep reinforcement learning based algorithm to provide high system performance with low computational complexity. Simulation results demonstrate that B-FL can resist malicious attacks from edge devices and servers, and the training latency of B-FL can be significantly reduced by deep reinforcement learning based algorithm compared with baseline algorithms.
[[2209.04007] FedDAR: Federated Domain-Aware Representation Learning](http://arxiv.org/abs/2209.04007)
Cross-silo Federated learning (FL) has become a promising tool in machine learning applications for healthcare. It allows hospitals/institutions to train models with sufficient data while the data is kept private. To make sure the FL model is robust when facing heterogeneous data among FL clients, most efforts focus on personalizing models for clients. However, the latent relationships between clients' data are ignored. In this work, we focus on a special non-iid FL problem, called Domain-mixed FL, where each client's data distribution is assumed to be a mixture of several predefined domains. Recognizing the diversity of domains and the similarity within domains, we propose a novel method, FedDAR, which learns a domain shared representation and domain-wise personalized prediction heads in a decoupled manner. For simplified linear regression settings, we have theoretically proved that FedDAR enjoys a linear convergence rate. For general settings, we have performed intensive empirical studies on both synthetic and real-world medical datasets which demonstrate its superiority over prior FL methods.
[[2209.04184] Anomaly Detection through Unsupervised Federated Learning](http://arxiv.org/abs/2209.04184)
Federated learning (FL) is proving to be one of the most promising paradigms for leveraging distributed resources, enabling a set of clients to collaboratively train a machine learning model while keeping the data decentralized. The explosive growth of interest in the topic has led to rapid advancements in several core aspects like communication efficiency, handling non-IID data, privacy, and security capabilities. However, the majority of FL works only deal with supervised tasks, assuming that clients' training sets are labeled. To leverage the enormous unlabeled data on distributed edge devices, in this paper, we aim to extend the FL paradigm to unsupervised tasks by addressing the problem of anomaly detection in decentralized settings. In particular, we propose a novel method in which, through a preprocessing phase, clients are grouped into communities, each having similar majority (i.e., inlier) patterns. Subsequently, each community of clients trains the same anomaly detection model (i.e., autoencoders) in a federated fashion. The resulting model is then shared and used to detect anomalies within the clients of the same community that joined the corresponding federated process. Experiments show that our method is robust, and it can detect communities consistent with the ideal partitioning in which groups of clients having the same inlier patterns are known. Furthermore, the performance is significantly better than those in which clients train models exclusively on local data and comparable with federated models of ideal communities' partition.
[[2209.04230] Survey on Deep Fuzzy Systems in regression applications: a view on interpretability](http://arxiv.org/abs/2209.04230)
Regression problems have been more and more embraced by deep learning (DL) techniques. The increasing number of papers recently published in this domain, including surveys and reviews, shows that deep regression has captured the attention of the community due to efficiency and good accuracy in systems with high-dimensional data. However, many DL methodologies have complex structures that are not readily transparent to human users. Accessing the interpretability of these models is an essential factor for addressing problems in sensitive areas such as cyber-security systems, medical, financial surveillance, and industrial processes. Fuzzy logic systems (FLS) are inherently interpretable models, well known in the literature, capable of using nonlinear representations for complex systems through linguistic terms with membership degrees mimicking human thought. Within an atmosphere of explainable artificial intelligence, it is necessary to consider a trade-off between accuracy and interpretability for developing intelligent models. This paper aims to investigate the state-of-the-art on existing methodologies that combine DL and FLS, namely deep fuzzy systems, to address regression problems, configuring a topic that is currently not sufficiently explored in the literature and thus deserves a comprehensive survey.