secure

Title: Developing Hands-on Labs for Source Code Vulnerability Detection with AI. (arXiv:2302.00750v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2302.00750
Code URL: null
Copy Paste: [[2302.00750] Developing Hands-on Labs for Source Code Vulnerability Detection with AI](http://arxiv.org/abs/2302.00750) #secure
Summary:
As the role of information and communication technologies gradually increases in our lives, source code security becomes a significant issue to protect against malicious attempts Furthermore with the advent of data-driven techniques, there is now a growing interest in leveraging machine learning and natural language processing as a source code assurance method to build trustworthy systems Therefore training our future software developers to write secure source code is in high demand In this thesis we propose a framework including learning modules and hands on labs to guide future IT professionals towards developing secure programming habits and mitigating source code vulnerabilities at the early stages of the software development lifecycle In this thesis our goal is to design learning modules with a set of hands on labs that will introduce students to secure programming practices using source code and log file analysis tools to predict and identify vulnerabilities In a Secure Coding Education framework we will improve students skills and awareness on source code vulnerabilities detection tools and mitigation techniques integrate concepts of source code vulnerabilities from Function API and library level to bad programming habits and practices leverage deep learning NLP and static analysis tools for log file analysis to introduce the root cause of source code vulnerabilities

security

Title: SSO-Monitor: Fully-Automatic Large-Scale Landscape, Security, and Privacy Analyses of Single Sign-On in the Wild. (arXiv:2302.01024v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2302.01024
Code URL: null
Copy Paste: [[2302.01024] SSO-Monitor: Fully-Automatic Large-Scale Landscape, Security, and Privacy Analyses of Single Sign-On in the Wild](http://arxiv.org/abs/2302.01024) #security
Summary:
Single Sign-On (SSO) shifts the crucial authentication process on a website to to the underlying SSO protocols and their correct implementation. To strengthen SSO security, organizations, such as IETF and W3C, maintain advisories to address known threats. One could assume that these security best practices are widely deployed on websites. We show that this assumption is a fallacy. We present SSO-MONITOR, an open-source fully-automatic large-scale SSO landscape, security, and privacy analysis tool. In contrast to all previous work, SSO-MONITOR uses a highly extensible, fully automated workflow with novel visual-based SSO detection techniques, enhanced security and privacy analyses, and continuously updated monitoring results. It receives a list of domains as input to discover the login pages, recognize the supported Identity Providers (IdPs), and execute the SSO. It further reveals the current security level of SSO in the wild compared to the security best practices on paper. With SSO-MONITOR, we automatically identified 1,632 websites with 3,020 Apple, Facebook, or Google logins within the Tranco 10k. Our continuous monitoring also revealed how quickly these numbers change over time. SSO-MONITOR can automatically login to each SSO website. It records the logins by tracing HTTP and in-browser communication to detect widespread security and privacy issues automatically. We introduce a novel deep-level inspection of HTTP parameters that we call SMARTPARMS. Using SMARTPARMS for security analyses, we uncovered URL parameters in 5 Client Application (Client) secret leakages and 337 cases with weak CSRF protection. We additionally identified 447 cases with no CSRF protection, 342 insecure SSO flows and 9 cases with nested URL parameters, leading to an open redirect in one case. SSO-MONITOR reveals privacy leakages that deanonymize users in 200 cases.

Title: Generalized Uncertainty Principles for Quantum Cryptography. (arXiv:2302.01026v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2302.01026
Code URL: null
Copy Paste: [[2302.01026] Generalized Uncertainty Principles for Quantum Cryptography](http://arxiv.org/abs/2302.01026) #security
Summary:
We know the classical public cryptographic algorithms are based on certain NP-hard problems such as the integer factoring in RSA and the discrete logarithm in Diffie-Hellman. They are going to be vulnerable with fault-tolerant quantum computers. We also know that the uncertainty principle for quantum bits or qubits such as quantum key distribution or QKD based on the quantum uncertainty principle offers the information theoretical security. The interesting implication with the paradigm shifts from classical computing to quantum computing is that the NP-hardness used for classical cryptography may shift to the uncertainty principles for quantum cryptography including quantum symmetric encryption, post-quantum cryptography, as well as quantum encryption in phase space for coherent optical communications. This paper would like to explore those so-called generalized uncertainty principles and explain what their implications are for quantum security. We identified three generalized uncertainty principles offering quantum security: non-commutability between permutation gates, non-commutability between the displacement and phase shift operators for coherent states, and the modular Diophantine Equation Problem in general linear algebra for post-quantum cryptography.

Title: Fixing Hardware Security Bugs with Large Language Models. (arXiv:2302.01215v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2302.01215
Code URL: null
Copy Paste: [[2302.01215] Fixing Hardware Security Bugs with Large Language Models](http://arxiv.org/abs/2302.01215) #security
Summary:
Novel AI-based code-writing Large Language Models (LLMs) such as OpenAI's Codex have demonstrated capabilities in many coding-adjacent domains. In this work we consider how LLMs maybe leveraged to automatically repair security relevant bugs present in hardware designs. We focus on bug repair in code written in the Hardware Description Language Verilog. For this study we build a corpus of domain-representative hardware security bugs. We then design and implement a framework to quantitatively evaluate the performance of any LLM tasked with fixing the specified bugs. The framework supports design space exploration of prompts (i.e., prompt engineering) and identifying the best parameters for the LLM. We show that an ensemble of LLMs can repair all ten of our benchmarks. This ensemble outperforms the state-of-the-art Cirfix hardware bug repair tool on its own suite of bugs. These results show that LLMs can repair hardware security bugs and the framework is an important step towards the ultimate goal of an automated end-to-end bug repair framework.

privacy

Title: Blocking JavaScript without Breaking the Web: An Empirical Investigation. (arXiv:2302.01182v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2302.01182
Code URL: null
Copy Paste: [[2302.01182] Blocking JavaScript without Breaking the Web: An Empirical Investigation](http://arxiv.org/abs/2302.01182) #privacy
Summary:
Modern websites heavily rely on JavaScript (JS) to implement legitimate functionality as well as privacy-invasive advertising and tracking. Browser extensions such as NoScript block any script not loaded by a trusted list of endpoints, thus hoping to block privacy-invasive scripts while avoiding breaking legitimate website functionality. In this paper, we investigate whether blocking JS on the web is feasible without breaking legitimate functionality. To this end, we conduct a large-scale measurement study of JS blocking on 100K websites. We evaluate the effectiveness of different JS blocking strategies in tracking prevention and functionality breakage. Our evaluation relies on quantitative analysis of network requests and resource loads as well as manual qualitative analysis of visual breakage. First, we show that while blocking all scripts is quite effective at reducing tracking, it significantly degrades functionality on approximately two-thirds of the tested websites. Second, we show that selective blocking of a subset of scripts based on a curated list achieves a better tradeoff. However, there remain approximately 15% mixed scripts, which essentially merge tracking and legitimate functionality and thus cannot be blocked without causing website breakage. Finally, we show that fine-grained blocking of a subset of JS methods, instead of scripts, reduces major breakage by 3.7$\times$ while providing the same level of tracking prevention. Our work highlights the promise and open challenges in fine-grained JS blocking for tracking prevention without breaking the web.

Title: Privacy Risk for anisotropic Langevin dynamics using relative entropy bounds. (arXiv:2302.00766v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.00766
Code URL: null
Copy Paste: [[2302.00766] Privacy Risk for anisotropic Langevin dynamics using relative entropy bounds](http://arxiv.org/abs/2302.00766) #privacy
Summary:
The privacy preserving properties of Langevin dynamics with additive isotropic noise have been extensively studied. However, the isotropic noise assumption is very restrictive: (a) when adding noise to existing learning algorithms to preserve privacy and maintain the best possible accuracy one should take into account the relative magnitude of the outputs and their correlations; (b) popular algorithms such as stochastic gradient descent (and their continuous time limits) appear to possess anisotropic covariance properties. To study the privacy risks for the anisotropic noise case, one requires general results on the relative entropy between the laws of two Stochastic Differential Equations with different drifts and diffusion coefficients. Our main contribution is to establish such a bound using stability estimates for solutions to the Fokker-Planck equations via functional inequalities. With additional assumptions, the relative entropy bound implies an $(\epsilon,\delta)$-differential privacy bound. We discuss the practical implications of our bound related to privacy risk in different contexts.Finally, the benefits of anisotropic noise are illustrated using numerical results on optimising a quadratic loss or calibrating a neural network.

Title: Fed-GLOSS-DP: Federated, Global Learning using Synthetic Sets with Record Level Differential Privacy. (arXiv:2302.01068v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.01068
Code URL: null
Copy Paste: [[2302.01068] Fed-GLOSS-DP: Federated, Global Learning using Synthetic Sets with Record Level Differential Privacy](http://arxiv.org/abs/2302.01068) #privacy
Summary:
This work proposes Fed-GLOSS-DP, a novel approach to privacy-preserving learning that uses synthetic data to train federated models. In our approach, the server recovers an approximation of the global loss landscape in a local neighborhood based on synthetic samples received from the clients. In contrast to previous, point-wise, gradient-based, linear approximation (such as FedAvg), our formulation enables a type of global optimization that is particularly beneficial in non-IID federated settings. We also present how it rigorously complements record-level differential privacy. Extensive results show that our novel formulation gives rise to considerable improvements in terms of convergence speed and communication costs. We argue that our new approach to federated learning can provide a potential path toward reconciling privacy and accountability by sending differentially private, synthetic data instead of gradient updates. The source code will be released upon publication.

protect

Title: Multi-scale Feature Alignment for Continual Learning of Unlabeled Domains. (arXiv:2302.01287v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.01287
Code URL: null
Copy Paste: [[2302.01287] Multi-scale Feature Alignment for Continual Learning of Unlabeled Domains](http://arxiv.org/abs/2302.01287) #protect
Summary:
Methods for unsupervised domain adaptation (UDA) help to improve the performance of deep neural networks on unseen domains without any labeled data. Especially in medical disciplines such as histopathology, this is crucial since large datasets with detailed annotations are scarce. While the majority of existing UDA methods focus on the adaptation from a labeled source to a single unlabeled target domain, many real-world applications with a long life cycle involve more than one target domain. Thus, the ability to sequentially adapt to multiple target domains becomes essential. In settings where the data from previously seen domains cannot be stored, e.g., due to data protection regulations, the above becomes a challenging continual learning problem. To this end, we propose to use generative feature-driven image replay in conjunction with a dual-purpose discriminator that not only enables the generation of images with realistic features for replay, but also promotes feature alignment during domain adaptation. We evaluate our approach extensively on a sequence of three histopathological datasets for tissue-type classification, achieving state-of-the-art results. We present detailed ablation experiments studying our proposed method components and demonstrate a possible use-case of our continual UDA method for an unsupervised patch-based segmentation task given high-resolution tissue images.

Title: Protecting Cache States Against Both Speculative Execution Attacks and Side-channel Attacks. (arXiv:2302.00732v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2302.00732
Code URL: null
Copy Paste: [[2302.00732] Protecting Cache States Against Both Speculative Execution Attacks and Side-channel Attacks](http://arxiv.org/abs/2302.00732) #protect
Summary:
Cache side-channel attacks and speculative execution attacks that leak information through cache states are stealthy and dangerous attacks on hardware that must be mitigated. Unfortunately, defenses proposed for cache side-channel attacks do not mitigate all cache-based speculative execution attacks and vice versa. Since both classes of attacks must be addressed, we propose comprehensive cache architectures to do this.

We show a framework to analyze the security of a secure cache. We identify same-domain speculative execution attacks, and show they evade cache side-channel defenses. We present new hardware security mechanisms that address target attacks and reduce performance overhead. We design two Speculative and Timing Attack Resilient (STAR) caches that defeat both cache side-channel attacks and cache-based speculative execution attacks. These comprehensive defenses have low performance overhead of 6.6% and 8.8%.

defense

Title: Beyond Pretrained Features: Noisy Image Modeling Provides Adversarial Defense. (arXiv:2302.01056v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.01056
Code URL: null
Copy Paste: [[2302.01056] Beyond Pretrained Features: Noisy Image Modeling Provides Adversarial Defense](http://arxiv.org/abs/2302.01056) #defense
Summary:
Masked Image Modeling (MIM) has been a prevailing framework for self-supervised visual representation learning. Within the pretraining-finetuning paradigm, the MIM framework trains an encoder by reconstructing masked image patches with the help of a decoder which would be abandoned when the encoder is used for finetuning. Despite its state-of-the-art performance on clean images, MIM models are vulnerable to adversarial attacks, limiting its real-world application, and few studies have focused on this issue. In this paper, we have discovered that noisy image modeling (NIM), a variant of MIM that uses denoising as the pre-text task, provides not only good pretrained visual features, but also effective adversarial defense for downstream models. To achieve a better accuracy-robustness trade-off, we further propose to sample the hyperparameter that controls the reconstruction difficulty from random distributions instead of setting it globally, and fine-tune downstream networks with denoised images. Experimental results demonstrate that our pre-trained denoising autoencoders are effective against different white-box, gray-box, and black-box attacks without being trained with adversarial images, while not harming the clean accuracy of fine-tuned models. Source code and models will be made available.

Title: Order but Not Execute in Order. (arXiv:2302.01177v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2302.01177
Code URL: null
Copy Paste: [[2302.01177] Order but Not Execute in Order](http://arxiv.org/abs/2302.01177) #defense
Summary:
We explore combining batch order-fair atomic broadcast (of-ABC) and frequent batch auction (FBA) as a defense against general order manipulations in blockchain-based decentralized exchanges (DEX). To justify FBA, we compare the welfare loss of decentralized exchanges under two market designs: continuous limit order book (CLOB), where transactions are processed sequentially, and FBA, where transactions are arranged into batches and a uniform price double auction decides execution order. We model three types of players, common investors, privately informed traders, and arbitrageurs who can provide liquidity and front-run, along with a decentralized exchange. Assuming that the exchange is realized over an of-ABC protocol, we find that FBA can achieve better social welfare compared to CLOB when (1) public information affecting the fundamental value of an asset is revealed more frequently, and/or (2) the block generation interval is sufficiently large, and/or (3) the priority fees are small compared to the asset price changes, and/or (4) fewer privately informed parties exist. Intrinsic reasons are that first, blockchains already treat time as discrete and ensuring order fairness there is non-trivial, allowing even more room for latency arbitrage rents under CLOB; second, sufficiently large block creation interval allows for information dispersion; third, higher priority fees discourage front-running under CLOB; additionally, FBA prioritizes price in deciding execution order and fewer informed traders mean less adverse price impact.

attack

Title: Are Diffusion Models Vulnerable to Membership Inference Attacks?. (arXiv:2302.01316v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.01316
Code URL: null
Copy Paste: [[2302.01316] Are Diffusion Models Vulnerable to Membership Inference Attacks?](http://arxiv.org/abs/2302.01316) #attack
Summary:
Diffusion-based generative models have shown great potential for image synthesis, but there is a lack of research on the security and privacy risks they may pose. In this paper, we investigate the vulnerability of diffusion models to Membership Inference Attacks (MIAs), a common privacy concern. Our results indicate that existing MIAs designed for GANs or VAE are largely ineffective on diffusion models, either due to inapplicable scenarios (e.g., requiring the discriminator of GANs) or inappropriate assumptions (e.g., closer distances between synthetic images and member images). To address this gap, we propose Step-wise Error Comparing Membership Inference (SecMI), a black-box MIA that infers memberships by assessing the matching of forward process posterior estimation at each timestep. SecMI follows the common overfitting assumption in MIA where member samples normally have smaller estimation errors, compared with hold-out samples. We consider both the standard diffusion models, e.g., DDPM, and the text-to-image diffusion models, e.g., Stable Diffusion. Experimental results demonstrate that our methods precisely infer the membership with high confidence on both of the two scenarios across six different datasets

Title: TransFool: An Adversarial Attack against Neural Machine Translation Models. (arXiv:2302.00944v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2302.00944
Code URL: null
Copy Paste: [[2302.00944] TransFool: An Adversarial Attack against Neural Machine Translation Models](http://arxiv.org/abs/2302.00944) #attack
Summary:
Deep neural networks have been shown to be vulnerable to small perturbations of their inputs, known as adversarial attacks. In this paper, we investigate the vulnerability of Neural Machine Translation (NMT) models to adversarial attacks and propose a new attack algorithm called TransFool. To fool NMT models, TransFool builds on a multi-term optimization problem and a gradient projection step. By integrating the embedding representation of a language model, we generate fluent adversarial examples in the source language that maintain a high level of semantic similarity with the clean samples. Experimental results demonstrate that, for different translation tasks and NMT architectures, our white-box attack can severely degrade the translation quality while the semantic similarity between the original and the adversarial sentences stays high. Moreover, we show that TransFool is transferable to unknown target models. Finally, based on automatic and human evaluations, TransFool leads to improvement in terms of success rate, semantic similarity, and fluency compared to the existing attacks both in white-box and black-box settings. Thus, TransFool permits us to better characterize the vulnerability of NMT models and outlines the necessity to design strong defense mechanisms and more robust NMT systems for real-life applications.

Title: Improvement and Evaluation of Resilience of Adaptive Cruise Control Against Spoofing Attacks Using Intrusion Detection System. (arXiv:2302.00876v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2302.00876
Code URL: null
Copy Paste: [[2302.00876] Improvement and Evaluation of Resilience of Adaptive Cruise Control Against Spoofing Attacks Using Intrusion Detection System](http://arxiv.org/abs/2302.00876) #attack
Summary:
The Adaptive Cruise Control (ACC) system automatically adjusts the vehicle speed to maintain a safe distance between the vehicle and the lead (ahead) vehicle. The controller's decision to accelerate or decelerate is computed using the target speed of the vehicle and the difference between the vehicle's distance to the lead vehicle and the safe distance from that vehicle. Spoofing the vehicle speed communicated through the Controller Area Network (CAN) of the vehicle impacts negatively the capability of the ACC (Proportional-Integral-Derivative variant) to prevent crashes with the lead vehicle. The paper reports about extending the ACC with a real-time Intrusion Detection System (IDS) capable of detecting speed spoofing attacks with reasonable response time and detection rate, and simulating the proposed extension using the CARLA simulation platform. The results of the simulation are: (1) spoofing the vehicle speed can foil the ACC to falsely accelerate, causing accidents, and (2) extending ACC with ML-based IDS to trigger the brakes when an accident is imminent may mitigate the problem. The findings suggest exploring the capabilities of ML-based IDS to support the resilience mechanisms in mitigating cyber-attacks on vehicles.

Title: SPECWANDS: An Efficient Priority-based Scheduler Against Speculation Contention Attacks. (arXiv:2302.00947v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2302.00947
Code URL: null
Copy Paste: [[2302.00947] SPECWANDS: An Efficient Priority-based Scheduler Against Speculation Contention Attacks](http://arxiv.org/abs/2302.00947) #attack
Summary:
Transient Execution Attacks (TEAs) have gradually become a major security threat to modern high-performance processors. They exploit the vulnerability of speculative execution to illegally access private data, and transmit them through timing-based covert channels. While new vulnerabilities are discovered continuously, the covert channels can be categorised to two types: 1) Persistent Type, in which covert channels are based on the layout changes of buffering, e.g. through caches or TLBs; 2) Volatile Type, in which covert channels are based on the contention of sharing resources, e.g. through execution units or issuing ports. The defenses against the persistent-type covert channels have been well addressed, while those for the volatile-type are still rather inadequate. Existing mitigation schemes for the volatile type such as Speculative Compression and Time-Division-Multiplexing will introduce significant overhead due to the need to stall the pipeline or to disallow resource sharing. In this paper, we look into such attacks and defenses with a new perspective, and propose a scheduling-based mitigation scheme, called SPECWANDS. It consists of three priority-based scheduling policies to prevent an attacker from transmitting the secret in different contention situations. SPECWANDS not only can defend against both inter-thread and intra-thread based attacks, but also can keep most of the performance benefit from speculative execution and resource-sharing. We evaluate its runtime overhead on SPEC 2006/2017 benchmarks. The experimental results show that SPECWANDS has a significant performance advantage (3%-5%) over the other two representative schemes.

Title: An Attack on The Speculative Vectorization: Leakage from Higher Dimensional Speculation. (arXiv:2302.01131v1 [cs.CR])

Paper URL: http://arxiv.org/abs/2302.01131
Code URL: null
Copy Paste: [[2302.01131] An Attack on The Speculative Vectorization: Leakage from Higher Dimensional Speculation](http://arxiv.org/abs/2302.01131) #attack
Summary:
This paper argues and shows that speculative vectorization, where a loop with rare or unknown memory dependencies are still vectorized, is fundamentally vulnerable and cannot be mitigated by existing defenses. We implement a simple proof of concept and show the leakage in Apple M2 SoC. We describe the source of leakage using Microarchitectural Leakage Descriptors MLD and we additionally describe principles to extend MLD for other optimization. Also as part of implementation we reverse engineer the M2 cache size and use threaded timer to differentiate between cache hit and miss.

Title: Universal Soldier: Using Universal Adversarial Perturbations for Detecting Backdoor Attacks. (arXiv:2302.00747v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.00747
Code URL: null
Copy Paste: [[2302.00747] Universal Soldier: Using Universal Adversarial Perturbations for Detecting Backdoor Attacks](http://arxiv.org/abs/2302.00747) #attack
Summary:
Deep learning models achieve excellent performance in numerous machine learning tasks. Yet, they suffer from security-related issues such as adversarial examples and poisoning (backdoor) attacks. A deep learning model may be poisoned by training with backdoored data or by modifying inner network parameters. Then, a backdoored model performs as expected when receiving a clean input, but it misclassifies when receiving a backdoored input stamped with a pre-designed pattern called "trigger". Unfortunately, it is difficult to distinguish between clean and backdoored models without prior knowledge of the trigger. This paper proposes a backdoor detection method by utilizing a special type of adversarial attack, universal adversarial perturbation (UAP), and its similarities with a backdoor trigger. We observe an intuitive phenomenon: UAPs generated from backdoored models need fewer perturbations to mislead the model than UAPs from clean models. UAPs of backdoored models tend to exploit the shortcut from all classes to the target class, built by the backdoor trigger. We propose a novel method called Universal Soldier for Backdoor detection (USB) and reverse engineering potential backdoor triggers via UAPs. Experiments on 345 models trained on several datasets show that USB effectively detects the injected backdoor and provides comparable or better results than state-of-the-art methods.

robust

Title: RobustNeRF: Ignoring Distractors with Robust Losses. (arXiv:2302.00833v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.00833
Code URL: null
Copy Paste: [[2302.00833] RobustNeRF: Ignoring Distractors with Robust Losses](http://arxiv.org/abs/2302.00833) #robust
Summary:
Neural radiance fields (NeRF) excel at synthesizing new views given multi-view, calibrated images of a static scene. When scenes include distractors, which are not persistent during image capture (moving objects, lighting variations, shadows), artifacts appear as view-dependent effects or 'floaters'. To cope with distractors, we advocate a form of robust estimation for NeRF training, modeling distractors in training data as outliers of an optimization problem. Our method successfully removes outliers from a scene and improves upon our baselines, on synthetic and real-world scenes. Our technique is simple to incorporate in modern NeRF frameworks, with few hyper-parameters. It does not assume a priori knowledge of the types of distractors, and is instead focused on the optimization problem rather than pre-processing or modeling transient objects. More results on our page https://robustnerf.github.io/public.

Title: SHINE: Deep Learning-Based Accessible Parking Management System. (arXiv:2302.00837v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.00837
Code URL: null
Copy Paste: [[2302.00837] SHINE: Deep Learning-Based Accessible Parking Management System](http://arxiv.org/abs/2302.00837) #robust
Summary:
The enhancement of science and technology has helped expand urban cities like never before. Due to the undeniable benefits of owning a private vehicle, the number of cars has rocketed in many parts of the world, including South Korea. However, these gradual increments in the number of vehicles lead to parking-related problems, including the abuse of disabled parking spaces (referred to as accessible parking spaces hereafter). Due to the high frame rate of surveillance cameras, traditional license plate recognition (LPR) systems are ineffective in real-time. On the other hand, natural and artificial noise and differences in lighting and weather conditions make detection and recognition difficult for these systems. With the growing concept of parking 4.0, many sensors, IoT and deep learning-based approaches have been applied to automatic LPR and parking management systems. However, the studies show a need for a robust and efficient model for managing accessible parking spaces in South Korea. We have proposed a novel system called 'SHINE', which uses the deep learning-based object detection algorithm for detecting the vehicle, license plate, and disability badges (referred to as cards, badges, or access badges hereafter) and then authenticates the rights to use the accessible parking spaces by coordinating with the central server. This model, achieving 92.16% mean average precision, is believed to solve the problem of accessible parking space abuse.

Title: Exploring Invariant Representation for Visible-Infrared Person Re-Identification. (arXiv:2302.00884v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.00884
Code URL: null
Copy Paste: [[2302.00884] Exploring Invariant Representation for Visible-Infrared Person Re-Identification](http://arxiv.org/abs/2302.00884) #robust
Summary:
Cross-spectral person re-identification, which aims to associate identities to pedestrians across different spectra, faces a main challenge of the modality discrepancy. In this paper, we address the problem from both image-level and feature-level in an end-to-end hybrid learning framework named robust feature mining network (RFM). In particular, we observe that the reflective intensity of the same surface in photos shot in different wavelengths could be transformed using a linear model. Besides, we show the variable linear factor across the different surfaces is the main culprit which initiates the modality discrepancy. We integrate such a reflection observation into an image-level data augmentation by proposing the linear transformation generator (LTG). Moreover, at the feature level, we introduce a cross-center loss to explore a more compact intra-class distribution and modality-aware spatial attention to take advantage of textured regions more efficiently. Experiment results on two standard cross-spectral person re-identification datasets, i.e., RegDB and SYSU-MM01, have demonstrated state-of-the-art performance.

Title: Open-Set Multi-Source Multi-Target Domain Adaptation. (arXiv:2302.00995v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.00995
Code URL: null
Copy Paste: [[2302.00995] Open-Set Multi-Source Multi-Target Domain Adaptation](http://arxiv.org/abs/2302.00995) #robust
Summary:
Single-Source Single-Target Domain Adaptation (1S1T) aims to bridge the gap between a labelled source domain and an unlabelled target domain. Despite 1S1T being a well-researched topic, they are typically not deployed to the real world. Methods like Multi-Source Domain Adaptation and Multi-Target Domain Adaptation have evolved to model real-world problems but still do not generalise well. The fact that most of these methods assume a common label-set between source and target is very restrictive. Recent Open-Set Domain Adaptation methods handle unknown target labels but fail to generalise in multiple domains. To overcome these difficulties, first, we propose a novel generic domain adaptation (DA) setting named Open-Set Multi-Source Multi-Target Domain Adaptation (OS-nSmT), with n and m being number of source and target domains respectively. Next, we propose a graph attention based framework named DEGAA which can capture information from multiple source and target domains without knowing the exact label-set of the target. We argue that our method, though offered for multiple sources and multiple targets, can also be agnostic to various other DA settings. To check the robustness and versatility of DEGAA, we put forward ample experiments and ablation studies.

Title: Paced-Curriculum Distillation with Prediction and Label Uncertainty for Image Segmentation. (arXiv:2302.01049v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.01049
Code URL: https://github.com/mobarakol/p-cd
Copy Paste: [[2302.01049] Paced-Curriculum Distillation with Prediction and Label Uncertainty for Image Segmentation](http://arxiv.org/abs/2302.01049) #robust
Summary:
Purpose: In curriculum learning, the idea is to train on easier samples first and gradually increase the difficulty, while in self-paced learning, a pacing function defines the speed to adapt the training progress. While both methods heavily rely on the ability to score the difficulty of data samples, an optimal scoring function is still under exploration. Methodology: Distillation is a knowledge transfer approach where a teacher network guides a student network by feeding a sequence of random samples. We argue that guiding student networks with an efficient curriculum strategy can improve model generalization and robustness. For this purpose, we design an uncertainty-based paced curriculum learning in self distillation for medical image segmentation. We fuse the prediction uncertainty and annotation boundary uncertainty to develop a novel paced-curriculum distillation (PCD). We utilize the teacher model to obtain prediction uncertainty and spatially varying label smoothing with Gaussian kernel to generate segmentation boundary uncertainty from the annotation. We also investigate the robustness of our method by applying various types and severity of image perturbation and corruption. Results: The proposed technique is validated on two medical datasets of breast ultrasound image segmentation and robotassisted surgical scene segmentation and achieved significantly better performance in terms of segmentation and robustness. Conclusion: P-CD improves the performance and obtains better generalization and robustness over the dataset shift. While curriculum learning requires extensive tuning of hyper-parameters for pacing function, the level of performance improvement suppresses this limitation.

Title: GraphReg: Dynamical Point Cloud Registration with Geometry-aware Graph Signal Processing. (arXiv:2302.01109v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.01109
Code URL: https://github.com/zikai1/graphreg
Copy Paste: [[2302.01109] GraphReg: Dynamical Point Cloud Registration with Geometry-aware Graph Signal Processing](http://arxiv.org/abs/2302.01109) #robust
Summary:
This study presents a high-accuracy, efficient, and physically induced method for 3D point cloud registration, which is the core of many important 3D vision problems. In contrast to existing physics-based methods that merely consider spatial point information and ignore surface geometry, we explore geometry aware rigid-body dynamics to regulate the particle (point) motion, which results in more precise and robust registration. Our proposed method consists of four major modules. First, we leverage the graph signal processing (GSP) framework to define a new signature, (i.e., point response intensity for each point), by which we succeed in describing the local surface variation, resampling keypoints, and distinguishing different particles. Then, to address the shortcomings of current physics-based approaches that are sensitive to outliers, we accommodate the defined point response intensity to median absolute deviation (MAD) in robust statistics and adopt the X84 principle for adaptive outlier depression, ensuring a robust and stable registration. Subsequently, we propose a novel geometric invariant under rigid transformations to incorporate higher-order features of point clouds, which is further embedded for force modeling to guide the correspondence between pairwise scans credibly. Finally, we introduce an adaptive simulated annealing (ASA) method to search for the global optimum and substantially accelerate the registration process. We perform comprehensive experiments to evaluate the proposed method on various datasets captured from range scanners to LiDAR. Results demonstrate that our proposed method outperforms representative state-of-the-art approaches in terms of accuracy and is more suitable for registering large-scale point clouds. Furthermore, it is considerably faster and more robust than most competitors.

Title: Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt. (arXiv:2302.01171v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.01171
Code URL: null
Copy Paste: [[2302.01171] Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt](http://arxiv.org/abs/2302.01171) #robust
Summary:
Recently, inspired by DETR variants, query-based end-to-end instance segmentation (QEIS) methods have outperformed CNN-based models on large-scale datasets. Yet they would lose efficacy when only a small amount of training data is available since it's hard for the crucial queries/kernels to learn localization and shape priors. To this end, this work offers a novel unsupervised pre-training solution for low-data regimes. Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models by giving Saliency Prompt for queries/kernels. Our method contains three parts: 1) Saliency Masks Proposal is responsible for generating pseudo masks from unlabeled images based on the saliency mechanism. 2) Prompt-Kernel Matching transfers pseudo masks into prompts and injects the corresponding localization and shape priors to the best-matched kernels. 3) Kernel Supervision is applied to supply supervision at the kernel level for robust learning. From a practical perspective, our pre-training method helps QEIS models achieve a similar convergence speed and comparable performance with CNN-based models in low-data regimes. Experimental results show that our method significantly boosts several QEIS models on three datasets. Code will be made available.

Title: Model Monitoring and Robustness of In-Use Machine Learning Models: Quantifying Data Distribution Shifts Using Population Stability Index. (arXiv:2302.00775v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.00775
Code URL: null
Copy Paste: [[2302.00775] Model Monitoring and Robustness of In-Use Machine Learning Models: Quantifying Data Distribution Shifts Using Population Stability Index](http://arxiv.org/abs/2302.00775) #robust
Summary:
Safety goes first. Meeting and maintaining industry safety standards for robustness of artificial intelligence (AI) and machine learning (ML) models require continuous monitoring for faults and performance drops. Deep learning models are widely used in industrial applications, e.g., computer vision, but the susceptibility of their performance to environment changes (e.g., noise) \emph{after deployment} on the product, are now well-known. A major challenge is detecting data distribution shifts that happen, comparing the following: {\bf (i)} development stage of AI and ML models, i.e., train/validation/test, to {\bf (ii)} deployment stage on the product (i.e., even after `testing') in the environment. We focus on a computer vision example related to autonomous driving and aim at detecting shifts that occur as a result of adding noise to images. We use the population stability index (PSI) as a measure of presence and intensity of shift and present results of our empirical experiments showing a promising potential for the PSI. We further discuss multiple aspects of model monitoring and robustness that need to be analyzed \emph{simultaneously} to achieve robustness for industry safety standards. We propose the need for and the research direction toward \emph{categorizations} of problem classes and examples where monitoring for robustness is required and present challenges and pointers for future work from a \emph{practical} perspective.

Title: An Enhanced V-cycle MgNet Model for Operator Learning in Numerical Partial Differential Equations. (arXiv:2302.00938v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.00938
Code URL: null
Copy Paste: [[2302.00938] An Enhanced V-cycle MgNet Model for Operator Learning in Numerical Partial Differential Equations](http://arxiv.org/abs/2302.00938) #robust
Summary:
This study used a multigrid-based convolutional neural network architecture known as MgNet in operator learning to solve numerical partial differential equations (PDEs). Given the property of smoothing iterations in multigrid methods where low-frequency errors decay slowly, we introduced a low-frequency correction structure for residuals to enhance the standard V-cycle MgNet. The enhanced MgNet model can capture the low-frequency features of solutions considerably better than the standard V-cycle MgNet. The numerical results obtained using some standard operator learning tasks are better than those obtained using many state-of-the-art methods, demonstrating the efficiency of our model.Moreover, numerically, our new model is more robust in case of low- and high-resolution data during training and testing, respectively.

Title: Predicting Molecule-Target Interaction by Learning Biomedical Network and Molecule Representations. (arXiv:2302.00981v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.00981
Code URL: null
Copy Paste: [[2302.00981] Predicting Molecule-Target Interaction by Learning Biomedical Network and Molecule Representations](http://arxiv.org/abs/2302.00981) #robust
Summary:
The study of molecule-target interaction is quite important for drug discovery in terms of target identification, pathway study, drug-drug interaction, etc. Most existing methodologies utilize either biomedical network information or molecule structural features to predict potential interaction link. However, the biomedical network information based methods usually suffer from cold start problem, while structure based methods often give limited performance due to the structure/interaction assumption and data quality. To address these issues, we propose a pseudo-siamese Graph Neural Network method, namely MTINet+, which learns both biomedical network topological and molecule structural/chemical information as representations to predict potential interaction of given molecule and target pair. In MTINet+, 1-hop subgraphs of given molecule and target pair are extracted from known interaction of biomedical network as topological information, meanwhile the molecule structural and chemical attributes are processed as molecule information. MTINet+ learns these two types of information as embedding features for predicting the pair link. In the experiments of different molecule-target interaction tasks, MTINet+ significantly outperforms over the state-of-the-art baselines. In addition, in our designed network sparsity experiments , MTINet+ shows strong robustness against different sparse biomedical networks.

Title: Constrained Online Two-stage Stochastic Optimization: New Algorithms via Adversarial Learning. (arXiv:2302.00997v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.00997
Code URL: null
Copy Paste: [[2302.00997] Constrained Online Two-stage Stochastic Optimization: New Algorithms via Adversarial Learning](http://arxiv.org/abs/2302.00997) #robust
Summary:
We consider an online two-stage stochastic optimization with long-term constraints over a finite horizon of $T$ periods. At each period, we take the first-stage action, observe a model parameter realization and then take the second-stage action from a feasible set that depends both on the first-stage decision and the model parameter. We aim to minimize the cumulative objective value while guaranteeing that the long-term average second-stage decision belongs to a set. We propose a general algorithmic framework that derives online algorithms for the online two-stage problem from adversarial learning algorithms. Also, the regret bound of our algorithm cam be reduced to the regret bound of embedded adversarial learning algorithms. Based on our framework, we obtain new results under various settings. When the model parameter at each period is drawn from identical distributions, we derive state-of-art regret bound that improves previous bounds under special cases. Our algorithm is also robust to adversarial corruptions of model parameter realizations. When the model parameters are drawn from unknown non-stationary distributions and we are given prior estimates of the distributions, we develop a new algorithm from our framework with a regret $O(W_T+\sqrt{T})$, where $W_T$ measures the total inaccuracy of the prior estimates.

Title: Confidence and Dispersity Speak: Characterising Prediction Matrix for Unsupervised Accuracy Estimation. (arXiv:2302.01094v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.01094
Code URL: null
Copy Paste: [[2302.01094] Confidence and Dispersity Speak: Characterising Prediction Matrix for Unsupervised Accuracy Estimation](http://arxiv.org/abs/2302.01094) #robust
Summary:
This work aims to assess how well a model performs under distribution shifts without using labels. While recent methods study prediction confidence, this work reports prediction dispersity is another informative cue. Confidence reflects whether the individual prediction is certain; dispersity indicates how the overall predictions are distributed across all categories. Our key insight is that a well-performing model should give predictions with high confidence and high dispersity. That is, we need to consider both properties so as to make more accurate estimates. To this end, we use the nuclear norm that has been shown to be effective in characterizing both properties. Extensive experiments validate the effectiveness of nuclear norm for various models (e.g., ViT and ConvNeXt), different datasets (e.g., ImageNet and CUB-200), and diverse types of distribution shifts (e.g., style shift and reproduction shift). We show that the nuclear norm is more accurate and robust in accuracy estimation than existing methods. Furthermore, we validate the feasibility of other measurements (e.g., mutual information maximization) for characterizing dispersity and confidence. Lastly, we investigate the limitation of the nuclear norm, study its improved variant under severe class imbalance, and discuss potential directions.

Title: A general Markov decision process formalism for action-state entropy-regularized reward maximization. (arXiv:2302.01098v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.01098
Code URL: null
Copy Paste: [[2302.01098] A general Markov decision process formalism for action-state entropy-regularized reward maximization](http://arxiv.org/abs/2302.01098) #robust
Summary:
Previous work has separately addressed different forms of action, state and action-state entropy regularization, pure exploration and space occupation. These problems have become extremely relevant for regularization, generalization, speeding up learning and providing robust solutions at unprecedented levels. However, solutions of those problems are hectic, ranging from convex and non-convex optimization, and unconstrained optimization to constrained optimization. Here we provide a general dual function formalism that transforms the constrained optimization problem into an unconstrained convex one for any mixture of action and state entropies. The cases with pure action entropy and pure state entropy are understood as limits of the mixture.

Title: STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition. (arXiv:2302.01172v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.01172
Code URL: null
Copy Paste: [[2302.01172] STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition](http://arxiv.org/abs/2302.01172) #robust
Summary:
Recent innovations on hardware (e.g. Nvidia A100) have motivated learning N:M structured sparsity masks from scratch for fast model inference. However, state-of-the-art learning recipes in this regime (e.g. SR-STE) are proposed for non-adaptive optimizers like momentum SGD, while incurring non-trivial accuracy drop for Adam-trained models like attention-based LLMs. In this paper, we first demonstrate such gap origins from poorly estimated second moment (i.e. variance) in Adam states given by the masked weights. We conjecture that learning N:M masks with Adam should take the critical regime of variance estimation into account. In light of this, we propose STEP, an Adam-aware recipe that learns N:M masks with two phases: first, STEP calculates a reliable variance estimate (precondition phase) and subsequently, the variance remains fixed and is used as a precondition to learn N:M masks (mask-learning phase). STEP automatically identifies the switching point of two phases by dynamically sampling variance changes over the training trajectory and testing the sample concentration. Empirically, we evaluate STEP and other baselines such as ASP and SR-STE on multiple tasks including CIFAR classification, machine translation and LLM fine-tuning (BERT-Base, GPT-2). We show STEP mitigates the accuracy drop of baseline recipes and is robust to aggressive structured sparsity ratios.

Title: Convolutional Neural Operators. (arXiv:2302.01178v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.01178
Code URL: null
Copy Paste: [[2302.01178] Convolutional Neural Operators](http://arxiv.org/abs/2302.01178) #robust
Summary:
Although very successfully used in machine learning, convolution based neural network architectures -- believed to be inconsistent in function space -- have been largely ignored in the context of learning solution operators of PDEs. Here, we adapt convolutional neural networks to demonstrate that they are indeed able to process functions as inputs and outputs. The resulting architecture, termed as convolutional neural operators (CNOs), is shown to significantly outperform competing models on benchmark experiments, paving the way for the design of an alternative robust and accurate framework for learning operators.

Title: The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing. (arXiv:2302.01186v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.01186
Code URL: null
Copy Paste: [[2302.01186] The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing](http://arxiv.org/abs/2302.01186) #robust
Summary:
We propose $\textsf{ScaledGD($\lambda$)}$, a preconditioned gradient descent method to tackle the low-rank matrix sensing problem when the true rank is unknown, and when the matrix is possibly ill-conditioned. Using overparametrized factor representations, $\textsf{ScaledGD($\lambda$)}$ starts from a small random initialization, and proceeds by gradient descent with a specific form of damped preconditioning to combat bad curvatures induced by overparameterization and ill-conditioning. At the expense of light computational overhead incurred by preconditioners, $\textsf{ScaledGD($\lambda$)}$ is remarkably robust to ill-conditioning compared to vanilla gradient descent ($\textsf{GD}$) even with overprameterization. Specifically, we show that, under the Gaussian design, $\textsf{ScaledGD($\lambda$)}$ converges to the true low-rank matrix at a constant linear rate after a small number of iterations that scales only logarithmically with respect to the condition number and the problem dimension. This significantly improves over the convergence rate of vanilla $\textsf{GD}$ which suffers from a polynomial dependency on the condition number. Our work provides evidence on the power of preconditioning in accelerating the convergence without hurting generalization in overparameterized learning.

Title: Laplacian Change Point Detection for Single and Multi-view Dynamic Graphs. (arXiv:2302.01204v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.01204
Code URL: https://github.com/shenyanghuang/multilad
Copy Paste: [[2302.01204] Laplacian Change Point Detection for Single and Multi-view Dynamic Graphs](http://arxiv.org/abs/2302.01204) #robust
Summary:
Dynamic graphs are rich data structures that are used to model complex relationships between entities over time. In particular, anomaly detection in temporal graphs is crucial for many real world applications such as intrusion identification in network systems, detection of ecosystem disturbances and detection of epidemic outbreaks. In this paper, we focus on change point detection in dynamic graphs and address three main challenges associated with this problem: i). how to compare graph snapshots across time, ii). how to capture temporal dependencies, and iii). how to combine different views of a temporal graph. To solve the above challenges, we first propose Laplacian Anomaly Detection (LAD) which uses the spectrum of graph Laplacian as the low dimensional embedding of the graph structure at each snapshot. LAD explicitly models short term and long term dependencies by applying two sliding windows. Next, we propose MultiLAD, a simple and effective generalization of LAD to multi-view graphs. MultiLAD provides the first change point detection method for multi-view dynamic graphs. It aggregates the singular values of the normalized graph Laplacian from different views through the scalar power mean operation. Through extensive synthetic experiments, we show that i). LAD and MultiLAD are accurate and outperforms state-of-the-art baselines and their multi-view extensions by a large margin, ii). MultiLAD's advantage over contenders significantly increases when additional views are available, and iii). MultiLAD is highly robust to noise from individual views. In five real world dynamic graphs, we demonstrate that LAD and MultiLAD identify significant events as top anomalies such as the implementation of government COVID-19 interventions which impacted the population mobility in multi-view traffic networks.

Title: Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function. (arXiv:2302.01244v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.01244
Code URL: null
Copy Paste: [[2302.01244] Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function](http://arxiv.org/abs/2302.01244) #robust
Summary:
Probabilistic dynamics model ensemble is widely used in existing model-based reinforcement learning methods as it outperforms a single dynamics model in both asymptotic performance and sample efficiency. In this paper, we provide both practical and theoretical insights on the empirical success of the probabilistic dynamics model ensemble through the lens of Lipschitz continuity. We find that, for a value function, the stronger the Lipschitz condition is, the smaller the gap between the true dynamics- and learned dynamics-induced Bellman operators is, thus enabling the converged value function to be closer to the optimal value function. Hence, we hypothesize that the key functionality of the probabilistic dynamics model ensemble is to regularize the Lipschitz condition of the value function using generated samples. To test this hypothesis, we devise two practical robust training mechanisms through computing the adversarial noise and regularizing the value network's spectral norm to directly regularize the Lipschitz condition of the value functions. Empirical results show that combined with our mechanisms, model-based RL algorithms with a single dynamics model outperform those with an ensemble of probabilistic dynamics models. These findings not only support the theoretical insight, but also provide a practical solution for developing computationally efficient model-based RL algorithms.

biometric

steal

extraction

Title: Vision Transformer-based Feature Extraction for Generalized Zero-Shot Learning. (arXiv:2302.00875v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.00875
Code URL: null
Copy Paste: [[2302.00875] Vision Transformer-based Feature Extraction for Generalized Zero-Shot Learning](http://arxiv.org/abs/2302.00875) #extraction
Summary:
Generalized zero-shot learning (GZSL) is a technique to train a deep learning model to identify unseen classes using the image attribute. In this paper, we put forth a new GZSL approach exploiting Vision Transformer (ViT) to maximize the attribute-related information contained in the image feature. In ViT, the entire image region is processed without the degradation of the image resolution and the local image information is preserved in patch features. To fully enjoy these benefits of ViT, we exploit patch features as well as the CLS feature in extracting the attribute-related image feature. In particular, we propose a novel attention-based module, called attribute attention module (AAM), to aggregate the attribute-related information in patch features. In AAM, the correlation between each patch feature and the synthetic image attribute is used as the importance weight for each patch. From extensive experiments on benchmark datasets, we demonstrate that the proposed technique outperforms the state-of-the-art GZSL approaches by a large margin.

Title: An Efficient Convex Hull-Based Vehicle Pose Estimation Method for 3D LiDAR. (arXiv:2302.01034v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.01034
Code URL: null
Copy Paste: [[2302.01034] An Efficient Convex Hull-Based Vehicle Pose Estimation Method for 3D LiDAR](http://arxiv.org/abs/2302.01034) #extraction
Summary:
Vehicle pose estimation is essential in the perception technology of autonomous driving. However, due to the different density distributions of the LiDAR point cloud, it is challenging to achieve accurate direction extraction based on 3D LiDAR by using the existing pose estimation methods. In this paper, we proposed a novel convex hull-based vehicle pose estimation method. The extracted 3D cluster is reduced to the convex hull, reducing the computation burden. Then a novel criterion based on the minimum occlusion area is developed for the search-based algorithm, which can achieve accurate pose estimation. The proposed algorithm is validated on the KITTI dataset and a manually labeled dataset acquired at an industrial park. The results show that our proposed method can achieve better accuracy than the three mainstream algorithms while maintaining real-time speed.

Title: Combining Deep Neural Reranking and Unsupervised Extraction for Multi-Query Focused Summarization. (arXiv:2302.01148v1 [cs.CL])

Paper URL: http://arxiv.org/abs/2302.01148
Code URL: null
Copy Paste: [[2302.01148] Combining Deep Neural Reranking and Unsupervised Extraction for Multi-Query Focused Summarization](http://arxiv.org/abs/2302.01148) #extraction
Summary:
The CrisisFACTS Track aims to tackle challenges such as multi-stream fact-finding in the domain of event tracking; participants' systems extract important facts from several disaster-related events while incorporating the temporal order. We propose a combination of retrieval, reranking, and the well-known Integer Linear Programming (ILP) and Maximal Marginal Relevance (MMR) frameworks. In the former two modules, we explore various methods including an entity-based baseline, pre-trained and fine-tuned Question Answering systems, and ColBERT. We then use the latter module as an extractive summarization component by taking diversity and novelty criteria into account. The automatic scoring runs show strong results across the evaluation setups but also reveal shortcomings and challenges.

Title: Variational Autoencoder Learns Better Feature Representations for EEG-based Obesity Classification. (arXiv:2302.00789v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.00789
Code URL: null
Copy Paste: [[2302.00789] Variational Autoencoder Learns Better Feature Representations for EEG-based Obesity Classification](http://arxiv.org/abs/2302.00789) #extraction
Summary:
Obesity is a common issue in modern societies today that can lead to various diseases and significantly reduced quality of life. Currently, research has been conducted to investigate resting state EEG (electroencephalogram) signals with an aim to identify possible neurological characteristics associated with obesity. In this study, we propose a deep learning-based framework to extract the resting state EEG features for obese and lean subject classification. Specifically, a novel variational autoencoder framework is employed to extract subject-invariant features from the raw EEG signals, which are then classified by a 1-D convolutional neural network. Comparing with conventional machine learning and deep learning methods, we demonstrate the superiority of using VAE for feature extraction, as reflected by the significantly improved classification accuracies, better visualizations and reduced impurity measures in the feature representations. Future work can be directed to gaining an in-depth understanding regarding the spatial patterns that have been learned by the proposed model from a neurological view, as well as improving the interpretability of the proposed model by allowing it to uncover any temporal-related information.

Title: Sharp Lower Bounds on Interpolation by Deep ReLU Neural Networks at Irregularly Spaced Data. (arXiv:2302.00834v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.00834
Code URL: null
Copy Paste: [[2302.00834] Sharp Lower Bounds on Interpolation by Deep ReLU Neural Networks at Irregularly Spaced Data](http://arxiv.org/abs/2302.00834) #extraction
Summary:
We study the interpolation, or memorization, power of deep ReLU neural networks. Specifically, we consider the question of how efficiently, in terms of the number of parameters, deep ReLU networks can interpolate values at $N$ datapoints in the unit ball which are separated by a distance $\delta$. We show that $\Omega(N)$ parameters are required in the regime where $\delta$ is exponentially small in $N$, which gives the sharp result in this regime since $O(N)$ parameters are always sufficient. This also shows that the bit-extraction technique used to prove lower bounds on the VC dimension cannot be applied to irregularly spaced datapoints.

membership infer

federate

Title: No One Left Behind: Real-World Federated Class-Incremental Learning. (arXiv:2302.00903v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.00903
Code URL: https://github.com/jiahuadong/lga
Copy Paste: [[2302.00903] No One Left Behind: Real-World Federated Class-Incremental Learning](http://arxiv.org/abs/2302.00903) #federate
Summary:
Federated learning (FL) is a hot collaborative training framework via aggregating model parameters of decentralized local clients. However, most existing models unreasonably assume that data categories of FL framework are known and fxed in advance. It renders the global model to signifcantly degrade recognition performance on old categories (i.e., catastrophic forgetting), when local clients receive new categories consecutively under limited memory of storing old categories. Moreover, some new local clients that collect novel categories unseen by other clients may be introduced to the FL training irregularly, which further exacerbates the catastrophic forgetting on old categories. To tackle the above issues, we propose a novel Local-Global Anti-forgetting (LGA) model to address local and global catastrophic forgetting on old categories, which is a pioneering work to explore a global class-incremental model in the FL feld. Specifcally, considering tackling class imbalance of local client to surmount local forgetting, we develop a category-balanced gradient-adaptive compensation loss and a category gradient-induced semantic distillation loss. They can balance heterogeneous forgetting speeds of hard-to-forget and easy-to-forget old categories, while ensure intrinsic class relations consistency within different incremental tasks. Moreover, a proxy server is designed to tackle global forgetting caused by Non-IID class imbalance between different clients. It collects perturbed prototype images of new categories from local clients via prototype gradient communication under privacy preservation, and augments them via self-supervised prototype augmentation to choose the best old global model and improve local distillation gain. Experiments on representative datasets verify superior performance of our model against other comparison methods.

Title: Federated Analytics: A survey. (arXiv:2302.01326v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.01326
Code URL: null
Copy Paste: [[2302.01326] Federated Analytics: A survey](http://arxiv.org/abs/2302.01326) #federate
Summary:
Federated analytics (FA) is a privacy-preserving framework for computing data analytics over multiple remote parties (e.g., mobile devices) or silo-ed institutional entities (e.g., hospitals, banks) without sharing the data among parties. Motivated by the practical use cases of federated analytics, we follow a systematic discussion on federated analytics in this article. In particular, we discuss the unique characteristics of federated analytics and how it differs from federated learning. We also explore a wide range of FA queries and discuss various existing solutions and potential use case applications for different FA queries.

fair

Title: Uncertainty in Fairness Assessment: Maintaining Stable Conclusions Despite Fluctuations. (arXiv:2302.01079v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.01079
Code URL: null
Copy Paste: [[2302.01079] Uncertainty in Fairness Assessment: Maintaining Stable Conclusions Despite Fluctuations](http://arxiv.org/abs/2302.01079) #fair
Summary:
Several recent works encourage the use of a Bayesian framework when assessing performance and fairness metrics of a classification algorithm in a supervised setting. We propose the Uncertainty Matters (UM) framework that generalizes a Beta-Binomial approach to derive the posterior distribution of any criteria combination, allowing stable performance assessment in a bias-aware setting.We suggest modeling the confusion matrix of each demographic group using a Multinomial distribution updated through a Bayesian procedure. We extend UM to be applicable under the popular K-fold cross-validation procedure. Experiments highlight the benefits of UM over classical evaluation frameworks regarding informativeness and stability.

interpretability

Title: SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained model debugging and analysis. (arXiv:2302.00785v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.00785
Code URL: null
Copy Paste: [[2302.00785] SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained model debugging and analysis](http://arxiv.org/abs/2302.00785) #interpretability
Summary:
For the deployment of artificial intelligence (AI) in high-risk settings, such as healthcare, methods that provide interpretability/explainability or allow fine-grained error analysis are critical. Many recent methods for interpretability/explainability and fine-grained error analysis use concepts, which are meta-labels that are semantically meaningful to humans. However, there are only a few datasets that include concept-level meta-labels and most of these meta-labels are relevant for natural images that do not require domain expertise. Densely annotated datasets in medicine focused on meta-labels that are relevant to a single disease such as melanoma. In dermatology, skin disease is described using an established clinical lexicon that allows clinicians to describe physical exam findings to one another. To provide a medical dataset densely annotated by domain experts with annotations useful across multiple disease processes, we developed SkinCon: a skin disease dataset densely annotated by dermatologists. SkinCon includes 3230 images from the Fitzpatrick 17k dataset densely annotated with 48 clinical concepts, 22 of which have at least 50 images representing the concept. The concepts used were chosen by two dermatologists considering the clinical descriptor terms used to describe skin lesions. Examples include "plaque", "scale", and "erosion". The same concepts were also used to label 656 skin disease images from the Diverse Dermatology Images dataset, providing an additional external dataset with diverse skin tone representations. We review the potential applications for the SkinCon dataset, such as probing models, concept-based explanations, and concept bottlenecks. Furthermore, we use SkinCon to demonstrate two of these use cases: debugging mistakes of an existing dermatology AI model with concepts and developing interpretable models with post-hoc concept bottleneck models.

explainability

watermark

diffusion

Title: Dreamix: Video Diffusion Models are General Video Editors. (arXiv:2302.01329v1 [cs.CV])

Paper URL: http://arxiv.org/abs/2302.01329
Code URL: null
Copy Paste: [[2302.01329] Dreamix: Video Diffusion Models are General Video Editors](http://arxiv.org/abs/2302.01329) #diffusion
Summary:
Text-driven image and video diffusion models have recently achieved unprecedented generation realism. While diffusion models have been successfully applied for image editing, very few works have done so for video editing. We present the first diffusion-based method that is able to perform text-based motion and appearance editing of general videos. Our approach uses a video diffusion model to combine, at inference time, the low-resolution spatio-temporal information from the original video with new, high resolution information that it synthesized to align with the guiding text prompt. As obtaining high-fidelity to the original video requires retaining some of its high-resolution information, we add a preliminary stage of finetuning the model on the original video, significantly boosting fidelity. We propose to improve motion editability by a new, mixed objective that jointly finetunes with full temporal attention and with temporal attention masking. We further introduce a new framework for image animation. We first transform the image into a coarse video by simple image processing operations such as replication and perspective geometric projections, and then use our general video editor to animate it. As a further application, we can use our method for subject-driven video generation. Extensive qualitative and numerical experiments showcase the remarkable editing ability of our method and establish its superior performance compared to baseline methods.

Title: Efficient Graph Field Integrators Meet Point Clouds. (arXiv:2302.00942v1 [cs.LG])

Paper URL: http://arxiv.org/abs/2302.00942
Code URL: null
Copy Paste: [[2302.00942] Efficient Graph Field Integrators Meet Point Clouds](http://arxiv.org/abs/2302.00942) #diffusion
Summary:
We present two new classes of algorithms for efficient field integration on graphs encoding point clouds. The first class, SeparatorFactorization(SF), leverages the bounded genus of point cloud mesh graphs, while the second class, RFDiffusion(RFD), uses popular epsilon-nearest-neighbor graph representations for point clouds. Both can be viewed as providing the functionality of Fast Multipole Methods (FMMs), which have had a tremendous impact on efficient integration, but for non-Euclidean spaces. We focus on geometries induced by distributions of walk lengths between points (e.g., shortest-path distance). We provide an extensive theoretical analysis of our algorithms, obtaining new results in structural graph theory as a byproduct. We also perform exhaustive empirical evaluation, including on-surface interpolation for rigid and deformable objects (particularly for mesh-dynamics modeling), Wasserstein distance computations for point clouds, and the Gromov-Wasserstein variant.