diffusion

Title: Boosting Diffusion Models with an Adaptive Momentum Sampler. (arXiv:2308.11941v1 [cs.CV])

Title: LongDanceDiff: Long-term Dance Generation with Conditional Diffusion Model. (arXiv:2308.11945v1 [cs.CV])

Title: Efficient Transfer Learning in Diffusion Models via Adversarial Noise. (arXiv:2308.11948v1 [cs.CV])

Title: High-quality Image Dehazing with Diffusion Model. (arXiv:2308.11949v1 [cs.CV])

Title: Manipulating Embeddings of Stable Diffusion Prompts. (arXiv:2308.12059v1 [cs.CV])

Title: Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning. (arXiv:2308.12219v1 [cs.CL])

Title: Revolutionizing TCAD Simulations with Universal Device Encoding and Graph Attention Networks. (arXiv:2308.11624v1 [cs.LG])

Title: Shape-conditioned 3D Molecule Generation via Equivariant Diffusion Models. (arXiv:2308.11890v1 [cs.LG])

Title: On-Manifold Projected Gradient Descent. (arXiv:2308.12279v1 [cs.LG])

self-supervised

Title: An Analysis of Initial Training Strategies for Exemplar-Free Class-Incremental Learning. (arXiv:2308.11677v1 [cs.LG])

Title: WS-SfMLearner: Self-supervised Monocular Depth and Ego-motion Estimation on Surgical Videos with Unknown Camera Parameters. (arXiv:2308.11776v1 [cs.CV])

Title: Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations. (arXiv:2308.11796v1 [cs.CV])

Title: Semantic-Aware Implicit Template Learning via Part Deformation Consistency. (arXiv:2308.11916v1 [cs.CV])

Title: Head-Tail Cooperative Learning Network for Unbiased Scene Graph Generation. (arXiv:2308.12048v1 [cs.CV])

Title: CHORUS: Learning Canonicalized 3D Human-Object Spatial Relations from Unbounded Synthesized Images. (arXiv:2308.12288v1 [cs.CV])

foundation model

Title: EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE. (arXiv:2308.11971v1 [cs.CV])

Title: Local Distortion Aware Efficient Transformer Adaptation for Image Quality Assessment. (arXiv:2308.12001v1 [cs.CV])

generative

Title: Weakly Supervised Face and Whole Body Recognition in Turbulent Environments. (arXiv:2308.11757v1 [cs.CV])

Title: CoC-GAN: Employing Context Cluster for Unveiling a New Pathway in Image Generation. (arXiv:2308.11857v1 [cs.CV])

Title: A Probabilistic Fluctuation based Membership Inference Attack for Generative Models. (arXiv:2308.12143v1 [cs.LG])

Title: A Generative Approach for Image Registration of Visible-Thermal (VT) Cancer Faces. (arXiv:2308.12271v1 [cs.CV])

Title: Exploring the Effectiveness of GPT Models in Test-Taking: A Case Study of the Driver's License Knowledge Test. (arXiv:2308.11827v1 [cs.CL])

Title: How to Protect Copyright Data in Optimization of Large Language Models? (arXiv:2308.12247v1 [cs.LG])

In this paper, we show that large language model training and optimization can be seen as a softmax regression problem. We then establish a method of efficiently performing softmax regression in a way that prevents the regression function from generating copyrighted data. This yields a theoretical method of training large language models that avoids generating copyrighted data.
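As a rough illustration of what a softmax regression objective can look like, the sketch below assumes the least-squares form 0.5 * ||softmax(Ax) - b||^2. The excerpt above gives neither the paper's exact objective nor its copyright-protection term, so the function names and the loss form here are assumptions for illustration only.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax of a 1-D vector."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def softmax_regression_loss(A, x, b):
    """Assumed objective: 0.5 * || softmax(A x) - b ||_2^2."""
    r = softmax(A @ x) - b
    return 0.5 * float(r @ r)


def softmax_regression_grad(A, x, b):
    """Gradient of the assumed objective with respect to x.

    The softmax Jacobian is diag(p) - p p^T, which is symmetric, so the
    gradient is A^T (diag(p) - p p^T) (p - b).
    """
    p = softmax(A @ x)
    r = p - b
    return A.T @ (p * r - p * (p @ r))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(5, 3))        # hypothetical data matrix
    b = softmax(rng.normal(size=5))    # hypothetical target distribution
    x = np.zeros(3)
    for _ in range(200):               # plain gradient descent, fixed step size
        x -= 0.5 * softmax_regression_grad(A, x, b)
    print(softmax_regression_loss(A, x, b))
```

Plain gradient descent is used only to keep the sketch short; it is not the efficient method the abstract refers to.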

Title: Maintaining Plasticity via Regenerative Regularization. (arXiv:2308.11958v1 [cs.LG])

Title: Will More Expressive Graph Neural Networks do Better on Generative Tasks? (arXiv:2308.11978v1 [cs.LG])

Title: How Safe Am I Given What I See? Calibrated Prediction of Safety Chances for Image-Controlled Autonomy. (arXiv:2308.12252v1 [cs.LG])

anomaly

Title: VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection. (arXiv:2308.11681v1 [cs.CV])

Title: Exploring the Optimization Objective of One-Class Classification for Anomaly Detection. (arXiv:2308.11898v1 [cs.CV])

Title: Few-shot Anomaly Detection in Text with Deviation Learning. (arXiv:2308.11780v1 [cs.LG])

Title: Performance Comparison and Implementation of Bayesian Variants for Network Intrusion Detection. (arXiv:2308.11834v1 [cs.LG])

Title: Class Label-aware Graph Anomaly Detection. (arXiv:2308.11669v1 [cs.LG])

in-context

memory

Title: 3ET: Efficient Event-based Eye Tracking using a Change-Based ConvLSTM Network. (arXiv:2308.11771v1 [cs.CV])

Title: Less is More -- Towards parsimonious multi-task models using structured sparsity. (arXiv:2308.12114v1 [cs.CV])

Title: Cabrita: closing the gap for foreign languages. (arXiv:2308.11878v1 [cs.CL])

The main solution to the cost challenge is to rely on available pre-trained models, which, despite recent advancements such as the LLaMA and LLaMA-2 models, are still inefficient for certain domain-specific problems or ineffective in scenarios involving conversational memory resources, given the large number of tokens required to represent text.

To overcome this issue, we present a methodology named Cabrita, which, as our research demonstrates, addresses both the performance and the tokenization-efficiency problems at an affordable cost. We believe that this methodology can be applied to any transformer-like architecture. To validate the study, we conducted continuous pre-training exclusively on Portuguese text with a 3-billion-parameter model known as OpenLLaMA, resulting in a model named openCabrita 3B. openCabrita 3B also features a new tokenizer that significantly reduces the number of tokens required to represent the text. In our assessment, for few-shot learning tasks, this 3B model achieved results similar to those of a traditional continuous pre-training approach as well as to those of 7B English pre-trained models.

Title: Reranking Passages with Coarse-to-Fine Neural Retriever using List-Context Information. (arXiv:2308.12022v1 [cs.CL])

Title: PARseL: Towards a Verified Root-of-Trust over seL4. (arXiv:2308.11921v1 [cs.CR])

For higher-end devices, remote attestation (RA) is achievable via secure hardware components. For low-end (bare metal) devices, minimalistic hybrid (hardware/software) RA is effective, though it requires some hardware modifications. That leaves certain mid-range devices (e.g., ARM Cortex-A family) equipped with standard hardware components, e.g., a memory management unit (MMU) and perhaps a secure boot facility. In this space, seL4 (a verified microkernel with guaranteed process isolation) is a promising platform for attaining RA. HYDRA made a first step towards this, albeit without achieving any verifiability or provable guarantees.

This paper picks up where HYDRA left off by constructing the PARseL architecture, which separates all user-dependent components from the trusted computing base (TCB). This leads to much stronger isolation guarantees, based on seL4 alone, and facilitates formal verification. In PARseL, we use formal verification to obtain several security properties for the isolated RA TCB, including memory safety, functional correctness, and secret independence. We implement PARseL in F* and specify and prove the expected properties using Hoare logic. Next, we automatically translate the F* implementation to C using KaRaMeL, which preserves the verified properties in the PARseL C implementation (atop seL4). Finally, we instantiate and evaluate PARseL on a commodity platform -- a SabreLite embedded device.

Title: CACTUS: a Comprehensive Abstraction and Classification Tool for Uncovering Structures. (arXiv:2308.12031v1 [cs.LG])

Title: Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference. (arXiv:2308.12066v1 [cs.LG])

Title: Cached Operator Reordering: A Unified View for Fast GNN Training. (arXiv:2308.12093v1 [cs.LG])

few-shot

Title: Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts. (arXiv:2308.11793v1 [cs.CV])

Title: LFS-GAN: Lifelong Few-Shot Image Generation. (arXiv:2308.11917v1 [cs.CV])

Title: Knowledge-injected Prompt Learning for Chinese Biomedical Entity Normalization. (arXiv:2308.12025v1 [cs.CL])

Title: FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering. (arXiv:2308.12060v1 [cs.CL])

Title: Prompt2Model: Generating Deployable Models from Natural Language Instructions. (arXiv:2308.12261v1 [cs.CL])