diffusion

Title: RSDiff: Remote Sensing Image Generation from Text Using Diffusion Model. (arXiv:2309.02455v1 [cs.CV])

Title: Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter. (arXiv:2309.02773v1 [cs.CV])

Title: MCM: Multi-condition Motion Synthesis Framework for Multi-scenario. (arXiv:2309.03031v1 [cs.CV])

Title: SLiMe: Segment Like Me. (arXiv:2309.03179v1 [cs.CV])

Title: My Art My Choice: Adversarial Protection Against Unruly AI. (arXiv:2309.03198v1 [cs.CV])

Title: Diffusion on the Probability Simplex. (arXiv:2309.02530v1 [cs.LG])

self-supervised

Title: Self-Supervised Video Transformers for Isolated Sign Language Recognition. (arXiv:2309.02450v1 [cs.CV])

Title: A Survey of the Impact of Self-Supervised Pretraining for Diagnostic Tasks with Radiological Images. (arXiv:2309.02555v1 [cs.LG])

Title: Self-Supervised Pretraining Improves Performance and Inference Efficiency in Multiple Lung Ultrasound Interpretation Tasks. (arXiv:2309.02596v1 [cs.CV])

Title: Towards Unsupervised Graph Completion Learning on Graphs with Features and Structure Missing. (arXiv:2309.02762v1 [cs.LG])

foundation model

generative

Title: Hierarchical-level rain image generative model based on GAN. (arXiv:2309.02964v1 [cs.CV])

Title: Persona-aware Generative Model for Code-mixed Language. (arXiv:2309.02915v1 [cs.CL])

Title: Enhancing Semantic Communication with Deep Generative Models -- An ICASSP Special Session Overview. (arXiv:2309.02478v1 [cs.LG])

Title: Utilizing Generative Adversarial Networks for Stable Structure Generation in Angry Birds. (arXiv:2309.02614v1 [cs.LG])

Title: Generative Algorithms for Fusion of Physics-Based Wildfire Spread Models with Satellite Data for Initializing Wildfire Forecasts. (arXiv:2309.02615v1 [cs.LG])

anomaly

Title: A Critical Review of Common Log Data Sets Used for Evaluation of Sequence-based Anomaly Detection Techniques. (arXiv:2309.02854v1 [cs.LG])

in-context

Title: Gender-specific Machine Translation with Large Language Models. (arXiv:2309.03175v1 [cs.CL])

memory

Title: Compressing Vision Transformers for Low-Resource Visual Learning. (arXiv:2309.02617v1 [cs.CV])

Our chosen application environment is a battery-powered, memory-constrained unmanned aerial vehicle (UAV) carrying a single-board computer on the scale of an NVIDIA Jetson Nano with 4GB of RAM. At the same time, the UAV requires accuracy close to that of state-of-the-art ViTs to ensure safe object avoidance in autonomous navigation or correct localization of humans in search-and-rescue. The application also requires that inference latency be minimized. Hence, our target is to enable rapid inference of a vision transformer on an NVIDIA Jetson Nano (4GB) with minimal accuracy loss. This allows us to deploy ViTs on resource-constrained devices, opening up new possibilities in surveillance, environmental monitoring, etc. Our implementation is available at https://github.com/chensy7/efficient-vit.

Title: Bandwidth-efficient Inference for Neural Image Compression. (arXiv:2309.02855v1 [cs.CV])

Title: FishMOT: A Simple and Effective Method for Fish Tracking Based on IoU Matching. (arXiv:2309.02975v1 [cs.CV])

Title: Mayhem: Targeted Corruption of Register and Stack Variables. (arXiv:2309.02545v1 [cs.CR])

In this work, we push the boundary and show how Rowhammer can be further exploited to inject faults into stack variables and even register values in a victim's process. We achieve this by targeting the register value that is stored in the process's stack, which is subsequently flushed out to memory, where it becomes vulnerable to Rowhammer. When the faulty value is restored into the register, it is used in subsequent iterations. The register value can be stored in the stack via latent function calls in the source or by actively triggering signal handlers. We demonstrate the power of the findings by applying the techniques to bypass SUDO and SSH authentication. We further outline how MySQL and cryptographic libraries can be targeted with the new attack vector. Through extensive experimentation, this work overcomes a number of challenges to yield an end-to-end attack on an OpenSSL digital signature, including achieving co-location with stack and register variables, with synchronization provided via a blocking window. We show that stack and registers are no longer safe from the Rowhammer attack.

Title: Sparse Partitioning Around Medoids. (arXiv:2309.02557v1 [cs.LG])

Title: Unveiling Intractable Epileptogenic Brain Networks with Deep Learning Algorithms: A Novel and Comprehensive Framework for Scalable Seizure Prediction with Unimodal Neuroimaging Data in Pediatric Patients. (arXiv:2309.02580v1 [cs.LG])

Title: DECODE: Data-driven Energy Consumption Prediction leveraging Historical Data and Environmental Factors in Buildings. (arXiv:2309.02908v1 [cs.LG])

Title: CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra. (arXiv:2309.03060v1 [cs.LG])

Title: The Best Arm Evades: Near-optimal Multi-pass Streaming Lower Bounds for Pure Exploration in Multi-armed Bandits. (arXiv:2309.03145v1 [cs.LG])

few-shot

Title: Image-Object-Specific Prompt Learning for Few-Shot Class-Incremental Learning. (arXiv:2309.02833v1 [cs.CV])

Title: GRASS: Unified Generation Model for Speech Semantic Understanding. (arXiv:2309.02780v1 [cs.CL])

Title: Aligning Large Language Models for Clinical Tasks. (arXiv:2309.02884v1 [cs.CL])