[[2212.12439] Permissionless Refereed Tournaments](http://arxiv.org/abs/2212.12439) #secure
Scalability problems in programmable blockchains have created a strong demand for secure methods that move the bulk of computation outside the blockchain. One of the preferred solutions to this problem involves off-chain computers that compete interactively to prove to the limited blockchain that theirs is the correct result of a given intensive computation. Each off-chain computer spends effort linear on the cost of the computation, while the blockchain adjudicates disputes spending only logarithmic effort. However, this effort is multiplied by the number of competitors, rendering disputes that involve a significant number of parties impractical and susceptible to Sybil attacks. In this paper, we propose a practical dispute resolution algorithm by which a single honest competitor can win disputes while spending effort linear on the cost of the computation, but only logarithmic on the number of dishonest competitors. This algorithm is a novel, stronger primitive for building permissionless fraud-proof protocols, which doesn't rely on complex economic incentives to be enforced.
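The logarithmic dependence on the number of dishonest competitors can be illustrated with a toy single-elimination bracket. This is only a sketch under strong simplifying assumptions, not the paper's algorithm: the names `rounds_to_win` and `run_bracket` are hypothetical, and the "referee" is modelled as an oracle under which a correct claim always beats an incorrect one.

```python
def rounds_to_win(num_competitors: int) -> int:
    """Rounds a single-elimination bracket needs to reduce the field to one."""
    if num_competitors < 1:
        raise ValueError("need at least one competitor")
    # Equals ceil(log2(n)) for n >= 1, computed without floating point.
    return (num_competitors - 1).bit_length()


def run_bracket(claims, correct):
    """Pair up claims each round and keep one winner per pair.

    Assumption: the referee is an oracle, so a correct claim always wins
    its match; between two incorrect claims the first one advances.
    """
    field = list(claims)
    rounds = 0
    while len(field) > 1:
        survivors = []
        for i in range(0, len(field) - 1, 2):
            a, b = field[i], field[i + 1]
            survivors.append(a if a == correct or b != correct else b)
        if len(field) % 2:  # odd field: the last claim gets a bye
            survivors.append(field[-1])
        field = survivors
        rounds += 1
    return field[0], rounds
```

With 1000 incorrect claims and one correct claim, the correct claim survives after 10 rounds, so the number of matches it plays grows with the logarithm of the field size rather than with the field size itself.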
[[2212.12101] Security and Interpretability in Automotive Systems](http://arxiv.org/abs/2212.12101) #security
The lack of any sender authentication mechanism in place makes CAN (Controller Area Network) vulnerable to security threats. For instance, an attacker can impersonate an ECU (Electronic Control Unit) on the bus and send spoofed messages unobtrusively with the identifier of the impersonated ECU. To address the insecure nature of the system, this thesis demonstrates a sender authentication technique that uses power consumption measurements of the electronic control units (ECUs) and a classification model to determine the transmitting states of the ECUs. The method's evaluation in real-world settings shows that the technique applies in a broad range of operating conditions and achieves good accuracy.
A key challenge of machine learning-based security controls is the potential for false positives. A false-positive alert may induce panic in operators, lead to incorrect reactions, and in the long run cause alarm fatigue. For reliable decision-making in such a circumstance, knowing the cause of unusual model behavior is essential. However, the black-box nature of these models makes them uninterpretable. Therefore, another contribution of this thesis explores explanation techniques for image and time-series inputs that (1) assign weights to individual inputs based on their sensitivity toward the target class, and (2) quantify the variations in the explanation by reconstructing the sensitive regions of the inputs using a generative model.
In summary, this thesis (https://uwspace.uwaterloo.ca/handle/10012/18134) presents methods for addressing the security and interpretability in automotive systems, which can also be applied in other settings where safe, transparent, and reliable decision-making is crucial.
[[2212.12178] COVID Down Under: where did Australia's pandemic apps go wrong?](http://arxiv.org/abs/2212.12178) #security
Governments and businesses worldwide deployed a variety of technological measures to help prevent and track the spread of COVID-19. In Australia, these applications contained usability, accessibility, and security flaws that hindered their effectiveness and adoption. Australia, like most countries, has transitioned to treating COVID as endemic. However, it has yet to absorb lessons from the technological issues with its approach to the pandemic. In this short paper we provide a systematization of the most notable events; identify and review different failure modes of these applications; and develop recommendations for developing apps in the face of future crises. Our work focuses on a single country. However, Australia's issues are particularly instructive as they highlight surprising pitfalls that countries should address in the face of a future pandemic.
[[2212.12300] Matrix Based Adaptive Short Block Cipher](http://arxiv.org/abs/2212.12300) #security
Every day, millions of credit cards are swiped and transactions are carried out across the world. Due to numerous forms of unethical digital activities, users are vulnerable to credit card fraud, phishing, identity theft, etc. This paper outlines a novel block encryption algorithm involving multiple private keys and a resilient trapdoor function that ensures data security while maintaining an optimal run time and space complexity. The proposed scheme consists of an irrepressible trapdoor based on a depressed cubic function and a unique key generation algorithm that uses Fibonacci sequences and invertible square matrices for improved security. The paper involves data obtained from comprehensive cryptanalysis exploiting the strengths and weaknesses of the system and comments on its potential large-scale industry applications.
[[2212.12307] Defending against cybersecurity threats to the payments and banking system](http://arxiv.org/abs/2212.12307) #security
Cyber security threats to the payment and banking system have become a worldwide menace. The phenomenon has forced financial institutions to take risks as part of their business model. Hence, deliberate investment in sophisticated technologies and security measures has become imperative to safeguard against heavy financial losses and information breaches that may occur due to cyber-attacks. The proliferation of cyber crimes is a huge concern for various stakeholders in the banking sector. Usually, cyber-attacks are carried out via software systems running on a computing system in cyberspace. As such, to prevent risks of cyber-attacks on software systems, entities operating within cyberspace must be identified and the threats to the application security isolated after analyzing the vulnerabilities and developing defense mechanisms. This paper will examine various approaches that identify assets in cyberspace, classify the cyber threats, provide security defenses and map security measures to control types and functionalities. Thus, adopting the right application to the security threats and defenses will aid IT professionals and users alike in making decisions for developing a strong defense-in-depth mechanism.
[[2212.12308] Evaluation of Static Analysis on Web Applications](http://arxiv.org/abs/2212.12308) #security
Web services are becoming business-critical components, often deployed with critical software bugs that can be maliciously exploited. Web vulnerability scanners allow the detection of security vulnerabilities in web services by stressing the service from the point of view of an attacker. However, research and practice show that different scanners perform differently in vulnerability detection. This paper presents a qualitative evaluation of security vulnerabilities found in web applications. Some well-known vulnerability scanners have been used to identify security flaws in web service implementations. Many vulnerabilities have been observed, which confirms that many services are deployed without proper security testing. Additionally, having reviewed and considered several articles, the differences in the vulnerabilities detected and the high number of false positives observed highlight the limitations of web vulnerability scanners in detecting security vulnerabilities in web services. Furthermore, this work will discuss the static analysis approach for discovering security vulnerabilities in web applications, complementing it with proven research findings or solutions. These vulnerabilities include broken access control, cross-site scripting, SQL injection, buffer overflow, unrestricted file upload, broken authentication, etc. Web applications are becoming mission-essential components for businesses, potentially risking having several software vulnerabilities that hackers can exploit maliciously. A few vulnerability scanners have been used to detect security weaknesses in web service applications, and many vulnerabilities have been discovered, thus confirming that many online apps are launched without sufficient security testing.
[[2212.12495] Clones of the Unclonable: Nanoduplicating Optical PUFs and Applications](http://arxiv.org/abs/2212.12495) #security
Physical unclonable functions (PUFs), physical objects that are practically unclonable because of their random and uncontrollable manufacturing variations, are becoming increasingly popular as security primitives and unique identifiers in a fully digitized world. One of the central PUF premises states that both friends and foes, both legitimate manufacturers and external attackers alike, cannot clone a PUF, producing two instances that are the same. Using the latest nanofabrication techniques, we show that this premise is not always met: We demonstrate the possibility of effective PUF duplication through sophisticated manufacturers by producing 63 copies of a non-trivial optical scattering structure which exhibit essentially the same scattering behavior. The remaining minuscule differences are close to or below noise levels, whence the duplicates have to be considered fully equivalent from a PUF perspective. The possibility for manufacturer-based optical PUF duplication has positive and negative consequences at the same time: While fully breaking the security of certain schemes, it enables new applications, too. For example, it facilitates unforgeable labels for valuable items; the first key-free group identification schemes over digital networks; or new types of encryption/decryption devices that do not contain secret keys.
[[2212.12534] A Privacy-Preserving Model based on Differential Approach for Sensitive Data in Cloud Environment](http://arxiv.org/abs/2212.12534) #privacy
A large amount of data and applications need to be shared with various parties and stakeholders in the cloud environment for storage, computation, and data utilization. Since a third party operates the cloud platform, owners cannot fully trust this environment. However, it has become a challenge to ensure privacy preservation when sharing data effectively among different parties. This paper proposes a novel model that partitions data into sensitive and non-sensitive parts, injects noise into the sensitive data, and performs classification tasks using k-anonymization, differential privacy, and machine learning approaches. It allows multiple owners to share their data in the cloud environment for various purposes. The model specifies a communication protocol among the multiple untrusted parties involved in processing the owners' data. The proposed model preserves actual data by providing a robust mechanism. The experiments are performed over Heart Disease, Arrhythmia, Hepatitis, Indian-liver-patient, and Framingham datasets for Support Vector Machine, K-Nearest Neighbor, Random Forest, Naive Bayes, and Artificial Neural Network classifiers to compute the efficiency in terms of accuracy, precision, recall, and F1-score of the proposed model. The achieved results provide high accuracy, precision, recall, and F1-score up to 93.75%, 94.11%, 100%, and 87.99% and improvement up to 16%, 29%, 12%, and 11%, respectively, compared to previous works.
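The partition-then-perturb idea can be sketched as follows. The abstract does not say which noise mechanism the model uses, so this example assumes the standard Laplace mechanism from differential privacy; the function names, the `sensitivity` default, and the example column names are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_sensitive(records, sensitive_keys, epsilon, sensitivity=1.0, seed=None):
    """Return copies of the records with noise added to sensitive fields only.

    Assumption: every sensitive field is numeric and shares one sensitivity
    bound; the Laplace scale is sensitivity / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    noisy_records = []
    for record in records:
        noisy = dict(record)  # non-sensitive fields are copied unchanged
        for key in sensitive_keys:
            noisy[key] = record[key] + laplace_noise(scale, rng)
        noisy_records.append(noisy)
    return noisy_records
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy in whatever classifier is trained on the perturbed data.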
[[2212.12306] A comparison, analysis, and provision of methods in identifying types of malware and means of malware detection and protection against them](http://arxiv.org/abs/2212.12306) #protect
In this research paper, our intent is to outline different types of malware, their means of operation, and how they are detected, so that users can protect themselves against such attacks. Varied permissions and limited technical resources make detecting malware and such attacks more difficult. With the normal user being limited to the UI, their ability to see what happens in the background is virtually nonexistent. Many do not have control over how they grant permissions over the data that the applications they use control, or over how that data is stored or distributed. They also do not receive any notification as to whether their data is protected against various attacks or whether it has already been attacked. In this paper, we present evidence on what malware is, how malware operates, different types of malware, and the general means of defence.
[[2212.12309] How Cyber Criminal Use Social Engineering To Target Organizations](http://arxiv.org/abs/2212.12309) #attack
Social engineering is described as the art of manipulation. Cybercriminals use manipulation to victimize their targets, applying psychological principles to change their behavior and lead them to make unconscious decisions. This study identifies the attacks and techniques used by cybercriminals to conduct social engineering attacks within an organization. This study evaluates how social engineering attacks are delivered and the techniques used, and highlights how attackers take advantage of compromised systems. Lastly, this study will also evaluate and provide the best solutions to help mitigate social engineering attacks within an organization.
[[2212.12204] EndoBoost: a plug-and-play module for false positive suppression during computer-aided polyp detection in real-world colonoscopy (with dataset)](http://arxiv.org/abs/2212.12204) #robust
The advance of computer-aided detection systems using deep learning opened a new scope in endoscopic image analysis. However, the learning-based models developed on closed datasets are susceptible to unknown anomalies in complex clinical environments. In particular, the high false positive rate of polyp detection remains a major challenge in clinical practice. In this work, we release the FPPD-13 dataset, which provides a taxonomy and real-world cases of typical false positives during computer-aided polyp detection in real-world colonoscopy. We further propose a post-hoc module EndoBoost, which can be plugged into generic polyp detection models to filter out false positive predictions. This is realized by generative learning of the polyp manifold with normalizing flows and rejecting false positives through density estimation. Compared to supervised classification, this anomaly detection paradigm achieves better data efficiency and robustness in open-world settings. Extensive experiments demonstrate a promising false positive suppression in both retrospective and prospective validation. In addition, the released dataset can be used to perform 'stress' tests on established detection systems and encourages further research toward robust and reliable computer-aided endoscopic image analysis. The dataset and code will be publicly available at this http URL
[[2212.12411] Benchmark for Uncertainty & Robustness in Self-Supervised Learning](http://arxiv.org/abs/2212.12411) #robust
Self-Supervised Learning (SSL) is crucial for real-world applications, especially in data-hungry domains such as healthcare and self-driving cars. In addition to a lack of labeled data, these applications also suffer from distributional shifts. Therefore, an SSL method should provide robust generalization and uncertainty estimation in the test dataset to be considered a reliable model in such high-stakes domains. However, existing approaches often focus on generalization, without evaluating the model's uncertainty. The ability to compare SSL techniques for improving these estimates is therefore critical for research on the reliability of self-supervision models. In this paper, we explore variants of SSL methods, including Jigsaw Puzzles, Context, Rotation, Geometric Transformations Prediction for vision, as well as BERT and GPT for language tasks. We train SSL in auxiliary learning for vision and pre-training for language model, then evaluate the generalization (in-out classification accuracy) and uncertainty (expected calibration error) across different distribution covariate shift datasets, including MNIST-C, CIFAR-10-C, CIFAR-10.1, and MNLI. Our goal is to create a benchmark with outputs from experiments, providing a starting point for new SSL methods in Reliable Machine Learning. All source code to reproduce results is available at https://github.com/hamanhbui/reliable_ssl_baselines.
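For reference, expected calibration error is commonly computed by binning predictions by confidence and averaging the gap between accuracy and mean confidence per bin, weighted by bin size. The sketch below follows that common definition; the equal-width binning and the default of 10 bins are assumptions, not details taken from the benchmark.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins over [0, 1].

    confidences: predicted probability of the chosen class, one per sample.
    correct: booleans, True where the prediction matched the label.
    """
    if len(confidences) != len(correct):
        raise ValueError("inputs must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0] * n_bins
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hit_sums[idx] += 1 if ok else 0

    ece = 0.0
    for count, conf_sum, hits in zip(counts, conf_sums, hit_sums):
        if count:
            accuracy = hits / count
            mean_conf = conf_sum / count
            ece += (count / n) * abs(accuracy - mean_conf)
    return ece
```

A model that is always fully confident but right only half the time gets an ECE of 0.5, while a model whose confidence matches its accuracy in every bin gets an ECE near zero.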
[[2212.12092] Anomaly Detection using Ensemble Classification and Evidence Theory](http://arxiv.org/abs/2212.12092) #robust
Multi-class ensemble classification remains a popular focus of investigation within the research community. The popularization of cloud services has sped up its adoption due to the ease of deploying large-scale machine-learning models. It has also drawn the attention of the industrial sector because of its ability to identify common problems in production. However, there are challenges in forming an ensemble classifier, namely a proper selection and effective training of the pool of classifiers, the definition of a proper architecture for multi-class classification, and uncertainty quantification of the ensemble classifier. The robustness and effectiveness of the ensemble classifier lie in the selection of the pool of classifiers, as well as in the learning process. Hence, the selection and the training procedure of the pool of classifiers play a crucial role. An (ensemble) classifier learns to detect the classes that were used during the supervised training. However, when data with unknown conditions is injected, the trained classifier will attempt to predict the classes learned during the training. To this end, the uncertainty of the individual and ensemble classifier could be used to assess the learning capability. We present a novel approach for novelty detection using ensemble classification and evidence theory. A pool selection strategy is presented to build a solid ensemble classifier. We present an architecture for multi-class ensemble classification and an approach to quantify the uncertainty of the individual classifiers and the ensemble classifier. We use uncertainty for the anomaly detection approach. Finally, we use the benchmark Tennessee Eastman to perform experiments to test the ensemble classifier's prediction and anomaly detection capabilities.
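As background on evidence theory, the sketch below implements Dempster's rule of combination, the standard way to fuse two mass functions in Dempster-Shafer theory. The abstract does not state which combination rule the paper uses, so treating classifier outputs as mass functions and fusing them this way is an assumption; the function name and the example class labels are hypothetical.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels to its mass.
    Returns the normalized combined masses and the conflict mass K.
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            product = mass_a * mass_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + product
            else:
                conflict += product  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    norm = 1.0 - conflict
    return {s: m / norm for s, m in combined.items()}, conflict


# Two hypothetical classifiers over the frame {normal, fault}; the mass on
# the whole frame expresses each classifier's uncertainty.
frame = frozenset({"normal", "fault"})
clf_a = {frozenset({"normal"}): 0.6, frame: 0.4}
clf_b = {frozenset({"normal"}): 0.7, frame: 0.3}
fused, conflict = dempster_combine(clf_a, clf_b)
```

In this example the two sources agree, so the fused mass on `{normal}` rises to 0.88 and the residual mass on the whole frame drops to 0.12. One plausible way to use such a result for anomaly detection is to flag inputs whose fused uncertainty or conflict stays high, though that decision rule is not specified in the abstract.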
[[2212.12190] Look Around! A Neighbor Relation Graph Learning Framework for Real Estate Appraisal](http://arxiv.org/abs/2212.12190) #robust
Real estate appraisal is a crucial issue for urban applications, which aims to value the properties on the market. Traditional methods perform appraisal based on the domain knowledge, but suffer from the efforts of hand-crafted design. Recently, several methods have been developed to automatize the valuation process by taking the property trading transaction into account when estimating the property value. However, existing methods only consider the real estate itself, ignoring the relation between the properties. Moreover, naively aggregating the information of neighbors fails to model the relationships between the transactions. To tackle these limitations, we propose a novel Neighbor Relation Graph Learning Framework (ReGram) by incorporating the relation between target transaction and surrounding neighbors with the attention mechanism. To model the influence between communities, we integrate the environmental information and the past price of each transaction from other communities. Moreover, since the target transactions in different regions share some similarities and differences of characteristics, we introduce a dynamic adapter to model the different distributions of the target transactions based on the input-related kernel weights. Extensive experiments on the real-world dataset with various scenarios demonstrate that ReGram robustly outperforms the state-of-the-art methods. Furthermore, comprehensive ablation studies were conducted to examine the effectiveness of each component in ReGram.
[[2212.12121] Federated PCA on Grassmann Manifold for Anomaly Detection in IoT Networks](http://arxiv.org/abs/2212.12121) #federate
In the era of the Internet of Things (IoT), network-wide anomaly detection is a crucial part of monitoring IoT networks due to the inherent security vulnerabilities of most IoT devices. Principal Components Analysis (PCA) has been proposed to separate network traffic into two disjoint subspaces corresponding to normal and malicious behaviors for anomaly detection. However, the privacy concerns and limitations of devices' computing resources compromise the practical effectiveness of PCA. We propose a federated PCA-based Grassmannian optimization framework that coordinates IoT devices to aggregate a joint profile of normal network behaviors for anomaly detection. First, we introduce a privacy-preserving federated PCA framework to simultaneously capture the profile of various IoT devices' traffic. Then, we investigate the alternating direction method of multipliers gradient-based learning on the Grassmann manifold to guarantee fast training and the absence of detection latency using limited computational resources. Empirical results on the NSL-KDD dataset demonstrate that our method outperforms baseline approaches. Finally, we show that the Grassmann manifold algorithm is highly adapted for IoT anomaly detection, which permits drastically reducing the analysis time of the system. To the best of our knowledge, this is the first federated PCA algorithm for anomaly detection meeting the requirements of IoT networks.
[[2212.12158] Graph Federated Learning with Hidden Representation Sharing](http://arxiv.org/abs/2212.12158) #federate
Learning on Graphs (LoG) is widely used in multi-client systems when each client has insufficient local data, and multiple clients have to share their raw data to learn a model of good quality. One scenario is to recommend items to clients who have limited historical data and share similar preferences with other clients in a social network. On the other hand, due to the increasing demands for the protection of clients' data privacy, Federated Learning (FL) has been widely adopted: FL requires models to be trained in a multi-client system and restricts sharing of raw data among clients. The underlying potential data-sharing conflict between LoG and FL is under-explored, and how to benefit from both sides is a promising problem. In this work, we first formulate the Graph Federated Learning (GFL) problem that unifies LoG and FL in multi-client systems and then propose sharing hidden representations instead of the raw data of neighbors to protect data privacy as a solution. To overcome the biased gradient problem in GFL, we provide a gradient estimation method and its convergence analysis under the non-convex objective. In experiments, we evaluate our method in classification tasks on graphs. Our experiments show a good match between our theory and the practice.
[[2212.12191] Deep Unfolding-based Weighted Averaging for Federated Learning under Heterogeneous Environments](http://arxiv.org/abs/2212.12191) #federate
Federated learning is a collaborative model training method that iterates model updates at multiple clients and aggregates the updates at a central server. Device and statistical heterogeneity of the participating clients cause performance degradation, so an appropriate weight should be assigned to each client in the server's aggregation phase. This paper employs deep unfolding to learn weights that adapt to the heterogeneity, which yields a model with high accuracy on uniform test data. The results of numerical experiments indicate the high performance of the proposed method and the interpretable behavior of the learned weights.
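The server-side aggregation step that these weights feed into can be sketched as a weighted average of client parameter vectors. In the paper the per-client weights are learned through deep unfolding; here they are simply passed in, and the function name and the flat-list parameter representation are assumptions made for illustration.

```python
def weighted_average(client_params, weights):
    """Aggregate client parameter vectors with per-client weights.

    client_params: list of equal-length lists of floats, one per client.
    weights: one non-negative weight per client (normalized internally).
    """
    if not client_params or len(client_params) != len(weights):
        raise ValueError("need one weight per client")
    dim = len(client_params[0])
    if any(len(p) != dim for p in client_params):
        raise ValueError("all clients must share the same parameter shape")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(w * params[i] for w, params in zip(weights, client_params)) / total
        for i in range(dim)
    ]
```

For example, `weighted_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])` returns `[2.5, 3.5]`; equal weights reduce to a plain mean over clients, while unequal weights let the server emphasize clients it trusts more.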
[[2212.12393] A-NeSI: A Scalable Approximate Method for Probabilistic Neurosymbolic Inference](http://arxiv.org/abs/2212.12393) #explainability
We study the problem of combining neural networks with symbolic reasoning. Recently introduced frameworks for Probabilistic Neurosymbolic Learning (PNL), such as DeepProbLog, perform exponential-time exact inference, limiting the scalability of PNL solutions. We introduce Approximate Neurosymbolic Inference (A-NeSI): a new framework for PNL that uses neural networks for scalable approximate inference. A-NeSI 1) performs approximate inference in polynomial time without changing the semantics of probabilistic logics; 2) is trained using data generated by the background knowledge; 3) can generate symbolic explanations of predictions; and 4) can guarantee the satisfaction of logical constraints at test time, which is vital in safety-critical applications. Our experiments show that A-NeSI is the first end-to-end method to scale the Multi-digit MNISTAdd benchmark to sums of 15 MNIST digits, up from 4 in competing systems. Finally, our experiments show that A-NeSI achieves explainability and safety without a penalty in performance.