diffusion

Title: DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation. (arXiv:2310.13119v1 [cs.CV])

Title: CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation. (arXiv:2310.13165v1 [cs.CV])

Title: DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics. (arXiv:2310.13268v1 [cs.CV])

Title: ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection. (arXiv:2310.13545v1 [cs.CV])

Title: Particle Guidance: non-I.I.D. Diverse Sampling with Diffusion Models. (arXiv:2310.13102v1 [cs.LG])

self-supervised

Title: Multiscale Superpixel Structured Difference Graph Convolutional Network for VL Representation. (arXiv:2310.13447v1 [cs.CV])

Title: Longer-range Contextualized Masked Autoencoder. (arXiv:2310.13593v1 [cs.CV])

Title: GraphGPT: Graph Instruction Tuning for Large Language Models. (arXiv:2310.13023v1 [cs.CL])

foundation model

Title: Weakly-Supervised Semantic Segmentation with Image-Level Labels: from Traditional Models to Foundation Models. (arXiv:2310.13026v1 [cs.CV])

generative

Title: Conditional Generative Modeling for Images, 3D Animations, and Video. (arXiv:2310.13157v1 [cs.CV])

We introduce the use of Neural ODEs to model video dynamics using an encoder-decoder architecture, demonstrating their ability to predict future video frames despite being trained solely to reconstruct current frames. Next, we propose a conditional variant of continuous normalizing flows that enables higher-resolution image generation based on lower-resolution input, achieving comparable image quality while reducing parameters and training time. Our next contribution presents a pipeline that takes human images as input, automatically aligns a user-specified 3D character with the pose of the human, and facilitates pose editing based on partial inputs. Next, we derive the relevant mathematical details for denoising diffusion models that use non-isotropic Gaussian processes, and show comparable generation quality. Finally, we devise a novel denoising diffusion framework capable of solving all three video tasks of prediction, generation, and interpolation. We perform ablation studies, and show SOTA results on multiple datasets.

Our contributions are published articles at peer-reviewed venues. Overall, our research aims to make a meaningful contribution to the pursuit of more efficient and flexible generative models, with the potential to shape the future of computer vision.

Title: Generative error correction for code-switching speech recognition using large language models. (arXiv:2310.13013v1 [cs.CL])

Title: Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models. (arXiv:2310.13127v1 [cs.CL])

Title: Fast and Accurate Factual Inconsistency Detection Over Long Documents. (arXiv:2310.13189v1 [cs.CL])

Title: Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models. (arXiv:2310.13315v1 [cs.CL])

Title: Cache & Distil: Optimising API Calls to Large Language Models. (arXiv:2310.13561v1 [cs.CL])

Title: A Distributed Approach to Meteorological Predictions: Addressing Data Imbalance in Precipitation Prediction Models through Federated Learning and GANs. (arXiv:2310.13161v1 [cs.LG])

Title: DIG-MILP: a Deep Instance Generator for Mixed-Integer Linear Programming with Feasibility Guarantee. (arXiv:2310.13261v1 [cs.LG])

Title: Learning Recurrent Models with Temporally Local Rules. (arXiv:2310.13284v1 [cs.LG])

Title: Y-Diagonal Couplings: Approximating Posteriors with Conditional Wasserstein Distances. (arXiv:2310.13433v1 [cs.LG])

Title: Stable Nonconvex-Nonconcave Training via Linear Interpolation. (arXiv:2310.13459v1 [cs.LG])

anomaly

Title: Anomaly Detection of Command Shell Sessions based on DistilBERT: Unsupervised and Supervised Approaches. (arXiv:2310.13247v1 [cs.CL])

Title: FLTracer: Accurate Poisoning Attack Provenance in Federated Learning. (arXiv:2310.13424v1 [cs.CR])

Title: Positive-Unlabeled Node Classification with Structure-aware Graph Learning. (arXiv:2310.13538v1 [cs.LG])

in-context

Title: A Simple Baseline for Knowledge-Based Visual Question Answering. (arXiv:2310.13570v1 [cs.CV])

Title: Steering Large Language Models for Machine Translation with Finetuning and In-Context Learning. (arXiv:2310.13448v1 [cs.CL])

Title: Mind the instructions: a holistic evaluation of consistency and interactions in prompt-based learning. (arXiv:2310.13486v1 [cs.CL])

Title: Self-prompted Chain-of-Thought on Large Language Models for Open-domain Multi-hop Reasoning. (arXiv:2310.13552v1 [cs.CL])

Title: In-context Learning with Transformer Is Really Equivalent to a Contrastive Learning Pattern. (arXiv:2310.13220v1 [cs.LG])

memory

Title: What you see is what you get: Experience ranking with deep neural dataset-to-dataset similarity for topological localisation. (arXiv:2310.13622v1 [cs.CV])

Title: Learning Successor Representations with Distributed Hebbian Temporal Memory. (arXiv:2310.13391v1 [cs.LG])

few-shot

Title: CXR-CLIP: Toward Large Scale Chest X-ray Language-Image Pre-training. (arXiv:2310.13292v1 [cs.CV])

Title: APP: Adaptive Prototypical Pseudo-Labeling for Few-shot OOD Detection. (arXiv:2310.13380v1 [cs.CL])

Title: Cache me if you Can: an Online Cost-aware Teacher-Student framework to Reduce the Calls to Large Language Models. (arXiv:2310.13395v1 [cs.CL])

Title: Improving Cross-Lingual Transfer through Subtree-Aware Word Reordering. (arXiv:2310.13583v1 [cs.CL])

Title: Unsupervised Representation Learning to Aid Semi-Supervised Meta Learning. (arXiv:2310.13085v1 [cs.LG])