diffusion

Title: Unified Concept Editing in Diffusion Models. (arXiv:2308.14761v1 [cs.CV])

Title: C2G2: Controllable Co-speech Gesture Generation with Latent Diffusion Model. (arXiv:2308.15016v1 [cs.CV])

Title: DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior. (arXiv:2308.15070v1 [cs.CV])

Title: DiffusionVMR: Diffusion Model for Video Moment Retrieval. (arXiv:2308.15109v1 [cs.CV])

Title: Elucidating the Exposure Bias in Diffusion Models. (arXiv:2308.15321v1 [cs.LG])

Title: ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer. (arXiv:2308.15459v1 [cs.CL])

Title: Generating tabular datasets under differential privacy. (arXiv:2308.14784v1 [cs.LG])

self-supervised

Title: Exploring Model Transferability through the Lens of Potential Energy. (arXiv:2308.15074v1 [cs.CV])

Title: Detect, Augment, Compose, and Adapt: Four Steps for Unsupervised Domain Adaptation in Object Detection. (arXiv:2308.15353v1 [cs.CV])

Title: A General-Purpose Self-Supervised Model for Computational Pathology. (arXiv:2308.15474v1 [cs.CV])

Title: Neural approaches to spoken content embedding. (arXiv:2308.14905v1 [cs.CL])

In this thesis, we contribute new discriminative acoustic word embedding (AWE) and acoustically grounded word embedding (AGWE) approaches based on recurrent neural networks (RNNs). We improve model training in terms of both efficiency and performance. We take these developments beyond English to several low-resource languages and show that multilingual training improves performance when labeled data is limited. We apply our embedding models, both monolingual and multilingual, to the downstream tasks of query-by-example speech search and automatic speech recognition. Finally, we show how our embedding approaches compare with and complement more recent self-supervised speech models.

foundation model

Title: Auto-Prompting SAM for Mobile Friendly 3D Medical Image Segmentation. (arXiv:2308.14936v1 [cs.CV])

Title: Reprogramming under constraints: Revisiting efficient and reliable transferability of lottery tickets. (arXiv:2308.14969v1 [cs.LG])

Title: Efficient Model Personalization in Federated Learning via Client-Specific Prompt Generation. (arXiv:2308.15367v1 [cs.CV])

generative

Title: CLNeRF: Continual Learning Meets NeRF. (arXiv:2308.14816v1 [cs.CV])

Title: CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation. (arXiv:2308.15226v1 [cs.CV])

Title: Learning Modulated Transformation in GANs. (arXiv:2308.15472v1 [cs.CV])

Title: MadSGM: Multivariate Anomaly Detection with Score-based Generative Models. (arXiv:2308.15069v1 [cs.LG])

anomaly

Title: Evaluation of Key Spatiotemporal Learners for Print Track Anomaly Classification Using Melt Pool Image Streams. (arXiv:2308.14861v1 [cs.LG])

Title: ADFA: Attention-augmented Differentiable top-k Feature Adaptation for Unsupervised Medical Anomaly Detection. (arXiv:2308.15280v1 [cs.CV])

Title: MSFlow: Multi-Scale Flow-based Framework for Unsupervised Anomaly Detection. (arXiv:2308.15300v1 [cs.CV])

Although the absence of anomalous samples and annotations degrades unsupervised anomaly detection (UAD) performance, normalizing flows, an inconspicuous yet powerful class of statistical models, are well suited to unsupervised anomaly detection and localization. Flow-based probabilistic models, trained only on anomaly-free data, can efficiently distinguish unpredictable anomalies by assigning them much lower likelihoods than normal data.

Nevertheless, the size variation of unpredictable anomalies makes high-precision anomaly detection and localization harder for flow-based methods. To generalize across anomaly sizes, we propose a novel Multi-Scale Flow-based framework dubbed MSFlow, composed of asymmetrical parallel flows followed by a fusion flow to exchange multi-scale perceptions. Moreover, different multi-scale aggregation strategies are adopted for image-wise anomaly detection and pixel-wise anomaly localization according to the discrepancy between them. The proposed MSFlow is evaluated on three anomaly detection datasets, significantly outperforming existing methods. Notably, on the challenging MVTec AD benchmark, our MSFlow achieves a new state-of-the-art with a detection AUROC score of up to 99.7%, localization AUROC score of 98.8%, and PRO score of 97.1%. The reproducible code is available at https://github.com/cool-xuan/msflow.
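The likelihood-based scoring idea in this abstract can be illustrated with a deliberately tiny sketch. It is not MSFlow: it assumes a single one-dimensional affine flow z = (x - mu) / sigma onto a standard normal base density, fitted on anomaly-free scalars, and all function names and sample values are hypothetical.

```python
import math
import statistics

def fit_affine_flow(normal_samples):
    # "Training" on anomaly-free data only: estimate the affine map's parameters.
    mu = statistics.fmean(normal_samples)
    sigma = statistics.pstdev(normal_samples)
    return mu, sigma

def log_likelihood(x, mu, sigma):
    # Change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|, with dz/dx = 1 / sigma.
    z = (x - mu) / sigma
    log_base = -0.5 * z * z - 0.5 * math.log(2 * math.pi)
    log_det = -math.log(sigma)
    return log_base + log_det

def anomaly_score(x, mu, sigma):
    # Lower likelihood under the fitted model means a higher anomaly score.
    return -log_likelihood(x, mu, sigma)

# Hypothetical anomaly-free measurements clustered around 10.
normal = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
mu, sigma = fit_affine_flow(normal)
typical_score = anomaly_score(10.0, mu, sigma)
outlier_score = anomaly_score(15.0, mu, sigma)
```

A sample far from the training data (15.0 here) receives a much higher score than a typical one. A real flow-based detector would stack learned invertible layers over image features rather than use one fixed affine map.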

Title: AnomalyGPT: Detecting Industrial Anomalies using Large Vision-Language Models. (arXiv:2308.15366v1 [cs.CV])

Title: Assessing Cyclostationary Malware Detection via Feature Selection and Classification. (arXiv:2308.15237v1 [cs.CR])

Title: Tackling Diverse Minorities in Imbalanced Classification. (arXiv:2308.14838v1 [cs.LG])

in-context

memory

Title: Learning to Upsample by Learning to Sample. (arXiv:2308.15085v1 [cs.CV])

Title: MEMORY-VQ: Compression for Tractable Internet-Scale Memory. (arXiv:2308.14903v1 [cs.CL])

We propose MEMORY-VQ, a new method to reduce storage requirements of memory-augmented models without sacrificing performance. Our method uses a vector quantization variational autoencoder (VQ-VAE) to compress token representations. We apply MEMORY-VQ to the LUMEN model to obtain LUMEN-VQ, a memory model that achieves a 16x compression rate with comparable performance on the KILT benchmark. LUMEN-VQ enables practical retrieval augmentation even for extremely large retrieval corpora.
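As a rough illustration of why vector quantization shrinks stored representations, the sketch below replaces each vector with the index of its nearest codebook entry. It assumes a tiny hand-written codebook instead of one learned by a VQ-VAE, and the function names and values are hypothetical, not taken from MEMORY-VQ or LUMEN.

```python
def nearest_code(vector, codebook):
    # Index of the codebook entry closest to the vector (squared Euclidean distance).
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))

def compress(vectors, codebook):
    # Store one small integer per vector instead of its floating-point values.
    return [nearest_code(v, codebook) for v in vectors]

def decompress(codes, codebook):
    # Lossy reconstruction: each code is mapped back to its codebook entry.
    return [codebook[c] for c in codes]

# Hypothetical 2-D codebook and token representations.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
vectors = [(0.1, -0.1), (0.9, 1.2), (0.2, 0.8)]
codes = compress(vectors, codebook)       # [0, 1, 2]
restored = decompress(codes, codebook)    # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

The storage saving comes from keeping only the integer codes plus one shared codebook; the reconstruction is approximate, which is why the abstract reports performance alongside the compression rate.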

Title: Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models. (arXiv:2308.15022v1 [cs.CL])

Title: Scalable and Configurable Tracking for Any Rowhammer Threshold. (arXiv:2308.14889v1 [cs.CR])

To that end, we propose START, a Scalable Tracker for Any Rowhammer Threshold. Rather than relying on dedicated SRAM structures, START dynamically repurposes a small fraction of the Last-Level Cache (LLC) to store tracking metadata. START is based on the observation that while the memory contains millions of rows, typical workloads touch only a small subset of rows within a refresh period of 64ms, so allocating tracking entries on demand significantly reduces storage. If the application does not access many rows in memory, START does not reserve any LLC capacity. Otherwise, START dynamically uses 1 way, 2 ways, or 8 ways of the cache set based on demand. START consumes, on average, 9.4% of the LLC capacity to store metadata, which is 5X lower compared to dedicating a counter in LLC for each row in memory. We also propose START-M, a memory-mapped START for large-memory systems. Our designs require only 4KB SRAM for newly added structures and perform within 1% of idealized tracking even at TRH of less than 100.
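The on-demand allocation idea can be sketched in software. This is a minimal model under stated assumptions, not START's hardware design: it uses a Python dictionary in place of LLC ways, the class and method names are hypothetical, and it only flags rows instead of issuing a mitigation.

```python
class OnDemandRowTracker:
    """Counts activations only for rows that are actually touched."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = {}  # row id -> activations in the current refresh window

    def activate(self, row):
        # Allocate a counter on first touch, then increment it.
        count = self.counts.get(row, 0) + 1
        self.counts[row] = count
        # Report whether this row has reached the tracking threshold.
        return count >= self.threshold

    def end_refresh_window(self):
        # Counters are only meaningful within one refresh period.
        self.counts.clear()

    def entries_in_use(self):
        # Storage grows with the number of touched rows, not with total rows.
        return len(self.counts)

tracker = OnDemandRowTracker(threshold=3)
flags = [tracker.activate(row) for row in [7, 7, 42, 7]]
# flags == [False, False, False, True]; only 2 entries are allocated
```

Because entries exist only for touched rows, a workload that activates few rows uses almost no tracking storage, which is the observation the abstract builds on.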

Title: Randomized Line-to-Row Mapping for Low-Overhead Rowhammer Mitigations. (arXiv:2308.14907v1 [cs.CR])

Our paper provides the key insight that benign applications encounter thousands of hot rows (receiving more activations than the threshold) due to the memory mapping, which places spatially proximate lines in the same row to maximize the row-buffer hit rate. Unfortunately, this causes a row to receive activations for many frequently used lines. We propose Rubix, which breaks the spatial correlation in the line-to-row mapping by using an encrypted address to access the memory, reducing the likelihood of hot rows by 2 to 3 orders of magnitude. To aid row-buffer hits, Rubix randomizes a group of 1-4 lines. We also propose Rubix-D, which dynamically changes the line-to-row mapping. Rubix-D minimizes hot rows and makes it much harder for an adversary to learn the spatial neighbourhood of a row. Rubix reduces the slowdown of AQUA (from 15% to 1%), SRS (from 60% to 2%), and Blockhammer (from 600% to 3%) while incurring a storage overhead of less than 1 kilobyte.
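The contrast between a conventional and a randomized line-to-row mapping can be sketched as follows. This is an illustration under stated assumptions, not Rubix's actual cipher or address layout: it assumes 128 lines per row, a 16-bit group index, and a toy 4-round Feistel permutation keyed with BLAKE2b standing in for the encrypted address. All names and parameters are hypothetical.

```python
import hashlib

HALF_BITS = 8
HALF_MASK = (1 << HALF_BITS) - 1

def _round(value, key, round_no):
    # Keyed round function: one pseudo-random byte per (value, round) pair.
    return hashlib.blake2b(bytes([value, round_no]), key=key, digest_size=1).digest()[0]

def permute(index, key, rounds=4):
    # Feistel network: a keyed bijection on 16-bit indices.
    left, right = index >> HALF_BITS, index & HALF_MASK
    for r in range(rounds):
        left, right = right, left ^ _round(right, key, r)
    return (left << HALF_BITS) | right

def row_of(line, lines_per_row=128):
    # Conventional mapping: neighbouring lines share a row.
    return line // lines_per_row

def randomized_row_of(line, key, group_size=4, lines_per_row=128):
    # Scramble the group index, keep the offset, so each small group stays together.
    group, offset = divmod(line, group_size)
    scrambled_line = permute(group, key) * group_size + offset
    return scrambled_line // lines_per_row

key = b"demo-key"  # hypothetical key
conventional_rows = {row_of(line) for line in range(128)}
randomized_rows = {randomized_row_of(line, key) for line in range(128)}
```

Under the conventional mapping all 128 neighbouring lines land in one row, while the keyed mapping scatters their groups across rows. Lines within one group of `group_size` still share a row, which mirrors the abstract's point about preserving some row-buffer hits.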

Title: A Closer Look at the Security Risks in the Rust Ecosystem. (arXiv:2308.15046v1 [cs.CR])

In this paper, we perform a comprehensive investigation into the security risks present in the Rust ecosystem, asking "what are the characteristics of the vulnerabilities, what are the characteristics of the vulnerable packages, and how are the vulnerabilities fixed in practice?" To facilitate the study, we first compile a dataset of 433 vulnerabilities, 300 vulnerable code repositories, and 218 vulnerability fix commits in the Rust ecosystem, spanning over 7 years. With the dataset, we characterize the types, life spans, and evolution of the disclosed vulnerabilities. We then characterize the popularity, categorization, and vulnerability density of the vulnerable Rust packages, as well as their versions and code regions affected by the disclosed vulnerabilities. Finally, we characterize the complexity of vulnerability fixes and localities of corresponding code changes, and inspect how practitioners fix vulnerabilities in Rust packages with various localities.

Title: SMOClust: Synthetic Minority Oversampling based on Stream Clustering for Evolving Data Streams. (arXiv:2308.14845v1 [cs.LG])

Title: Streaming Compression of Scientific Data via weak-SINDy. (arXiv:2308.14962v1 [cs.LG])

Title: Incorporating Neuro-Inspired Adaptability for Continual Learning in Artificial Intelligence. (arXiv:2308.14991v1 [cs.LG])

Title: On-Device Learning with Binary Neural Networks. (arXiv:2308.15308v1 [cs.LG])

few-shot

Title: When hard negative sampling meets supervised contrastive learning. (arXiv:2308.14893v1 [cs.CV])

Title: Read-only Prompt Optimization for Vision-Language Few-shot Learning. (arXiv:2308.14960v1 [cs.CV])

Title: Few-Shot Object Detection via Synthetic Features with Optimal Transport. (arXiv:2308.15005v1 [cs.CV])

Title: TransPrompt v2: A Transferable Prompting Framework for Cross-task Text Classification. (arXiv:2308.15010v1 [cs.CL])

Title: Multi-party Goal Tracking with LLMs: Comparing Pre-training, Fine-tuning, and Prompt Engineering. (arXiv:2308.15231v1 [cs.CL])