[[2302.12994] Secure End-to-End Communications with Lightweight Cryptographic Algorithm](http://arxiv.org/abs/2302.12994) #secure
The field of lightweight cryptography has been gaining popularity as traditional cryptographic techniques are challenging to implement in resource-limited environments. This research paper presents an approach to utilizing the ESP32 microcontroller as a hardware platform to implement a lightweight cryptographic algorithm. Our approach employs KATAN32, the smallest block cipher of the KATAN family, with an 80-bit key and 32-bit blocks. The algorithm requires less computational power as it employs an 80 unsigned 64-bit integer key for encrypting and decrypting data. During encryption, a data array is passed into the encryption function with a key, which is then used to fill a buffer with an encrypted array. Similarly, the decryption function utilizes a buffer to fill an array of original data in 32 unsigned 64-bit integers. This study also investigates the optimal implementation of cryptography block ciphers, benchmarking performance against various metrics, including memory requirements (RAM), throughput, power consumption, and security. Our implementation demonstrates that data can be securely transmitted end-to-end with good throughput and low power consumption.
[[2302.12867] On homomorphic encryption using abelian groups: Classical security analysis](http://arxiv.org/abs/2302.12867) #security
In [15], Leonardi and Ruiz-Lopez propose an additively homomorphic public key encryption scheme whose security is expected to depend on the hardness of the learning homomorphism with noise problem (LHN). Choosing parameters for their primitive requires choosing three groups $G$, $H$, and $K$. In their paper, Leonardi and Ruiz-Lopez claim that, when $G$, $H$, and $K$ are abelian, then their public key cryptosystem is not quantum secure. In this paper, we study security for finite abelian groups $G$, $H$, and $K$ in the classical case. Moreover, we study quantum attacks on instantiations with solvable groups.
[[2302.12959] Chaotic Variational Auto encoder-based Adversarial Machine Learning](http://arxiv.org/abs/2302.12959) #security
Machine Learning (ML) has become the new contrivance in almost every field. This makes ML models a target of fraudsters through various adversarial attacks, which hinder their performance. Evasion and Data-Poison-based attacks are well known, especially in finance, healthcare, etc. This motivated us to propose a novel, computationally less expensive attack mechanism based on adversarial sample generation by a Variational Auto Encoder (VAE). It is well known that the Wavelet Neural Network (WNN) is considered computationally efficient in image and audio processing, speech recognition, and time-series forecasting. This paper proposed the VAE-Deep-Wavelet Neural Network (VAE-Deep-WNN), where the Encoder and Decoder employ WNN networks. Further, we proposed chaotic variants of both VAE with Multi-layer perceptron (MLP) and Deep-WNN and named them C-VAE-MLP and C-VAE-Deep-WNN, respectively. Here, we employed a Logistic map to generate random noise in the latent space. We performed VAE-based adversarial sample generation and applied it to various finance and cybersecurity problems such as loan default, credit card fraud, and churn modelling. We performed both Evasion and Data-Poison attacks on Logistic Regression (LR) and Decision Tree (DT) models. The results indicated that VAE-Deep-WNN outperformed the rest in the majority of the datasets and models. However, its chaotic variant C-VAE-Deep-WNN performed almost similarly to VAE-Deep-WNN in the majority of the datasets.
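To make the "Logistic map as a noise source" idea concrete, here is a minimal sketch of drawing a deterministic chaotic sequence and rescaling it for use as latent-space noise. The abstract does not give the map parameters, so `r = 4.0`, the seed `x0`, the burn-in length, and the rescaling to `[-1, 1]` are all assumptions made for illustration, and the function names are hypothetical.

```python
def logistic_map_noise(n, x0=0.3, r=4.0, burn_in=100):
    """Return n values in [0, 1] from the logistic map x <- r * x * (1 - x).

    r=4.0 (fully chaotic regime), x0 and burn_in are illustrative
    assumptions, not values taken from the paper.
    """
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    x = x0
    # Discard the first iterations so the output depends less on the seed.
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    values = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def to_latent_noise(values):
    """Rescale values from [0, 1] to [-1, 1] (an assumed convention)."""
    return [2.0 * v - 1.0 for v in values]


if __name__ == "__main__":
    noise = to_latent_noise(logistic_map_noise(8))
    print(noise)
```

Unlike a pseudo-random Gaussian draw, the sequence is fully determined by `x0` and `r`, so the same seed always reproduces the same noise.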
[[2302.13082] A Threat-Intelligence Driven Methodology to Incorporate Uncertainty in Cyber Risk Analysis and Enhance Decision Making](http://arxiv.org/abs/2302.13082) #security
The predictability and understandability of the world around us is limited, and many events are uncertain. It can be difficult to make decisions in these uncertain circumstances, as demonstrated by the changing measures taken to contain the COVID-19 pandemic. These decisions are not necessarily incorrect, but rather a reflection of the difficulty of decision making under uncertainty, where the probability and impact of events and measures are unknown. Information security is rapidly positioning itself around making decisions in uncertain situations. This means that it is not just about preventing or managing probable risks, but rather about dealing with unpredictable probabilities and effects. To contend with this, information security leaders should include strategies that reduce uncertainty and hence improve the quality of decision making. Risk assessment is a principal element of evidence-based decision making, especially in an ever-changing cyber threat landscape that constantly introduces uncertainties. Thus, it is essential to recognize that addressing uncertainty requires a new methodology and risk analysis approach that considers both known unknowns and unknown unknowns. To address this challenge, we propose the threat-intelligence based security assessment, and discuss a decision-making strategy under uncertainty, both of which support decision makers in this complex undertaking.
[[2302.13156] Why Do Deepfake Detectors Fail?](http://arxiv.org/abs/2302.13156) #security
Recent rapid advancements in deepfake technology have allowed the creation of highly realistic fake media, such as video, image, and audio. These materials pose significant challenges to human authentication, such as impersonation, misinformation, or even a threat to national security. To keep pace with these rapid advancements, several deepfake detection algorithms have been proposed, leading to an ongoing arms race between deepfake creators and deepfake detectors. Nevertheless, these detectors are often unreliable and frequently fail to detect deepfakes. This study highlights the challenges they face in detecting deepfakes, including (1) the pre-processing pipeline of artifacts and (2) the fact that generators of new, unseen deepfake samples have not been considered when building the defense models. Our work sheds light on the need for further research and development in this field to create more robust and reliable detectors.
[[2302.13161] Cybersecurity Challenges of Power Transformers](http://arxiv.org/abs/2302.13161) #security
Cyber threats against critical infrastructure, and their potential for devastating consequences, have increased significantly. The dependency of new power grid technology on information, data analytics, and communication systems makes the entire electricity network vulnerable to cyber threats. Power transformers play a critical role within the power grid and are now commonly enhanced through factory add-ons or intelligent monitoring systems added later to improve the condition monitoring of critical and long lead time assets such as transformers. However, the increased connectivity of those power transformers opens the door to more cyber attacks. Therefore, the need to detect and prevent cyber threats is becoming critical. The first step towards that would be a deeper understanding of the potential cyber-attack landscape against power transformers. Much of the existing literature pays attention to smart equipment within electricity distribution networks, and most methods proposed are based on model-based detection algorithms. Moreover, only a few of these works address the security vulnerabilities of power elements, especially transformers within the transmission network. To the best of our knowledge, there is no study in the literature that systematically investigates the cybersecurity challenges against the newly emerged smart transformers. This paper addresses this shortcoming by exploring the vulnerabilities and the attack vectors of power transformers within electricity networks, the possible attack scenarios and the risks associated with these attacks.
[[2302.12935] Visual Privacy: Current and Emerging Regulations Around Unconsented Video Analytics in Retail](http://arxiv.org/abs/2302.12935) #privacy
Video analytics is the practice of combining digital video data with machine learning models to infer various characteristics from that video. This capability has been used for years to detect objects, movement, and the number of customers in physical retail stores, but more complex machine learning models combined with more powerful computing power have unlocked new levels of possibility. Researchers claim it is now possible to infer a whole host of characteristics about an individual using video analytics, such as specific age, ethnicity, health status and emotional state. Moreover, an individual's visual identity can be augmented with information from other data providers to build out a detailed profile, all with the individual unknowingly contributing their physical presence in front of a retail store camera. Some retailers have begun to experiment with this new technology as a way to better know their customers. However, those same early adopters are caught in an evolving legal landscape around privacy and data ownership. This research looks into the current legal landscape and legislation currently in progress around the use of video analytics, specifically in the retail store setting. Because the ethical and legal norms around individualized video analytics are still heavily in flux, retailers are urged to adopt a wait-and-see approach or potentially incur costly legal expenses and risk damage to their brand.
[[2302.12909] Differentially Private Algorithms for the Stochastic Saddle Point Problem with Optimal Rates for the Strong Gap](http://arxiv.org/abs/2302.12909) #privacy
We show that convex-concave Lipschitz stochastic saddle point problems (also known as stochastic minimax optimization) can be solved under the constraint of $(\epsilon,\delta)$-differential privacy with *strong (primal-dual) gap* rate of $\tilde O\big(\frac{1}{\sqrt{n}} + \frac{\sqrt{d}}{n\epsilon}\big)$, where $n$ is the dataset size and $d$ is the dimension of the problem. This rate is nearly optimal, based on existing lower bounds in differentially private stochastic optimization. Specifically, we prove a tight upper bound on the strong gap via novel implementation and analysis of the recursive regularization technique repurposed for saddle point problems. We show that this rate can be attained with $O\big(\min\big\{\frac{n^2\epsilon^{1.5}}{\sqrt{d}}, n^{3/2}\big\}\big)$ gradient complexity, and $O(n)$ gradient complexity if the loss function is smooth. As a byproduct of our method, we develop a general algorithm that, given black-box access to a subroutine satisfying a certain $\alpha$ primal-dual accuracy guarantee with respect to the empirical objective, gives a solution to the stochastic saddle point problem with a strong gap of $\tilde{O}(\alpha+\frac{1}{\sqrt{n}})$. We show that this $\alpha$-accuracy condition is satisfied by standard algorithms for the empirical saddle point problem such as the proximal point method and the stochastic gradient descent ascent algorithm. Further, we show that even for simple problems it is possible for an algorithm to have zero weak gap and suffer from $\Omega(1)$ strong gap. We also show that there exists a fundamental tradeoff between stability and accuracy. Specifically, we show that any $\Delta$-stable algorithm has empirical gap $\Omega\big(\frac{1}{\Delta n}\big)$, and that this bound is tight. This result also holds more specifically for empirical risk minimization problems and may be of independent interest.
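To get a feel for the two bounds quoted above, the sketch below evaluates the strong-gap rate $\frac{1}{\sqrt{n}} + \frac{\sqrt{d}}{n\epsilon}$ and the non-smooth gradient complexity $\min\{n^2\epsilon^{1.5}/\sqrt{d},\, n^{3/2}\}$ for concrete inputs. It drops all constants and the logarithmic factors hidden by $\tilde O$, so the numbers only indicate scaling; the function names and the sample values of $n$, $d$, and $\epsilon$ are illustrative assumptions.

```python
import math


def strong_gap_rate(n, d, epsilon):
    """Leading-order strong-gap rate, ignoring constants and log factors."""
    return 1.0 / math.sqrt(n) + math.sqrt(d) / (n * epsilon)


def gradient_complexity_nonsmooth(n, d, epsilon):
    """Leading-order gradient complexity for the non-smooth case."""
    return min(n ** 2 * epsilon ** 1.5 / math.sqrt(d), n ** 1.5)


if __name__ == "__main__":
    n, d, epsilon = 10_000, 100, 1.0  # hypothetical problem sizes
    print(strong_gap_rate(n, d, epsilon))
    print(gradient_complexity_nonsmooth(n, d, epsilon))
```

With these sample values the statistical term $1/\sqrt{n}$ dominates the privacy term $\sqrt{d}/(n\epsilon)$, which is the regime where privacy comes at little extra cost in the rate.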
[[2302.13079] Privacy-Preserving Electricity Theft Detection based on Blockchain](http://arxiv.org/abs/2302.13079) #privacy
In most electricity theft detection schemes, consumers' power consumption data is directly input into the detection center. Although this is effective for detecting theft by consumers, the privacy of all consumers is at risk unless the detection center is assumed to be trusted. In practice, this assumption is unrealistic. Moreover, existing schemes may result in some security problems, such as the collusion attack due to the presence of a trusted third party, and malicious data tampering caused by the system operator (SO) being attacked. Aiming at the problems above, we propose a blockchain-based privacy-preserving electricity theft detection scheme without a third party. Specifically, the proposed scheme uses an improved functional encryption scheme to enable electricity theft detection and load monitoring while preserving consumers' privacy, and it stores consumers' data in a distributed manner on a blockchain to resolve security problems such as data tampering. Meanwhile, we build a long short-term memory network (LSTM) model to achieve higher accuracy for electricity theft detection. The proposed scheme is evaluated in a real environment, and the results show that it is more accurate in electricity theft detection within acceptable communication and computational overhead. Our system analysis demonstrates that the proposed scheme can resist various security attacks and preserve consumers' privacy.
[[2302.13053] RETEXO: Scalable Neural Network Training over Distributed Graphs](http://arxiv.org/abs/2302.13053) #privacy
Graph neural networks offer a promising approach to supervised learning over graph data. Graph data, especially when it is privacy-sensitive or too large to train on centrally, is often stored partitioned across disparate processing units (clients) which want to minimize the communication costs during collaborative training. The fully-distributed setup takes such partitioning to its extreme, wherein features of only a single node and its adjacent edges are kept locally with one client processor. Existing GNNs are not architected for training in such setups and incur prohibitive costs therein. We propose RETEXO, a novel transformation of existing GNNs that improves the communication efficiency during training in the fully-distributed setup. We experimentally confirm that RETEXO offers up to 6 orders of magnitude better communication efficiency even when training shallow GNNs, with a minimal trade-off in accuracy for supervised node classification tasks.
[[2302.13056] SATBA: An Invisible Backdoor Attack Based On Spatial Attention](http://arxiv.org/abs/2302.13056) #attack
As a new realm of AI security, backdoor attacks have drawn growing research attention in recent years. It is well known that a backdoor can be injected into a DNN model by training it on a poisoned dataset that consists of poisoned samples. The injected model outputs correct predictions on benign samples yet behaves abnormally on poisoned samples that include the trigger pattern. Most existing triggers of poisoned samples are visible and can be easily found by human visual inspection, and the trigger injection process causes feature loss in both the natural sample and the trigger. To solve the above problems, and inspired by the spatial attention mechanism, we introduce a novel backdoor attack named SATBA, which is invisible and can minimize the loss of the trigger to improve attack success rate and model accuracy. It extracts data features and generates a trigger pattern related to the clean data through spatial attention, then poisons the clean image by using a U-type model to plant the trigger into the original data. We demonstrate the effectiveness of our attack against three popular image classification DNNs on three standard datasets. In addition, we conduct extensive experiments on image similarity to show that our proposed attack provides practical stealthiness, which is critical for resisting backdoor defenses.
[[2302.12890] Edge-Based Detection and Localization of Adversarial Oscillatory Load Attacks Orchestrated By Compromised EV Charging Stations](http://arxiv.org/abs/2302.12890) #attack
In this paper, we investigate an edge-based approach for the detection and localization of coordinated oscillatory load attacks initiated by exploited EV charging stations against the power grid. We rely on the behavioral characteristics of the power grid in the presence of interconnected EVCS while combining cyber and physical layer features to implement deep learning algorithms for the effective detection of oscillatory load attacks at the EVCS. We evaluate the proposed detection approach by building a real-time test bed to synthesize benign and malicious data, which was generated by analyzing real-life EV charging data collected during recent years. The results demonstrate the effectiveness of the implemented approach with the Convolutional Long-Short Term Memory model producing optimal classification accuracy (99.4\%). Moreover, our analysis results shed light on the impact of such detection mechanisms towards building resiliency into different levels of the EV charging ecosystem while allowing power grid operators to localize attacks and take further mitigation measures. Specifically, we managed to decentralize the detection mechanism of oscillatory load attacks and create an effective alternative for operator-centric mechanisms to mitigate multi-operator and MitM oscillatory load attacks against the power grid. Finally, we leverage the created test bed to evaluate a distributed mitigation technique, which can be deployed on public/private charging stations to average out the impact of oscillatory load attacks while allowing the power system to recover smoothly within 1 second with minimal overhead.
[[2302.12967] Temporal-Channel Topology Enhanced Network for Skeleton-Based Action Recognition](http://arxiv.org/abs/2302.12967) #robust
Skeleton-based action recognition has become popular in recent years due to its efficiency and robustness. Most current methods adopt graph convolutional network (GCN) for topology modeling, but GCN-based methods are limited in long-distance correlation modeling and generalizability. In contrast, the potential of convolutional neural network (CNN) for topology modeling has not been fully explored. In this paper, we propose a novel CNN architecture, Temporal-Channel Topology Enhanced Network (TCTE-Net), to learn spatial and temporal topologies for skeleton-based action recognition. The TCTE-Net consists of two modules: the Temporal-Channel Focus module, which learns a temporal-channel focus matrix to identify the most critical feature representations, and the Dynamic Channel Topology Attention module, which dynamically learns spatial topological features, and fuses them with an attention mechanism to model long-distance channel-wise topology. We conduct experiments on NTU RGB+D, NTU RGB+D 120, and FineGym datasets. TCTE-Net shows state-of-the-art performance compared to CNN-based methods and achieves superior performance compared to GCN-based methods. The code is available at https://github.com/aikuniverse/TCTE-Net.
[[2302.13094] Knowledge-infused Contrastive Learning for Urban Imagery-based Socioeconomic Prediction](http://arxiv.org/abs/2302.13094) #robust
Monitoring sustainable development goals requires accurate and timely socioeconomic statistics, while ubiquitous and frequently updated urban imagery on the web, such as satellite and street view images, has emerged as an important source for socioeconomic prediction. In particular, recent studies turn to self-supervised contrastive learning with manually designed similarity metrics for urban imagery representation learning and subsequent socioeconomic prediction, which, however, suffers from effectiveness and robustness issues. To address such issues, in this paper, we propose a Knowledge-infused Contrastive Learning (KnowCL) model for urban imagery-based socioeconomic prediction. Specifically, we first introduce a knowledge graph (KG) to effectively model urban knowledge in spatiality, mobility, etc., and then build neural network based encoders to learn representations of an urban image in the associated semantic and visual spaces, respectively. Finally, we design a cross-modality based contrastive learning framework with a novel image-KG contrastive loss, which maximizes the mutual information between semantic and visual representations for knowledge infusion. Extensive experiments applying the learnt visual representations to socioeconomic prediction on three datasets demonstrate the superior performance of KnowCL, with over 30\% improvement in $R^2$ compared with baselines. Notably, our proposed KnowCL model applies to both satellite and street imagery, achieving both effectiveness and transferability, which provides insights into urban imagery-based socioeconomic prediction.
[[2302.12952] Robust language-based mental health assessments in time and space through social media](http://arxiv.org/abs/2302.12952) #robust
Compared to physical health, population mental health measurement in the U.S. is very coarse-grained. Currently, in the largest population surveys, such as those carried out by the Centers for Disease Control or Gallup, mental health is only broadly captured through "mentally unhealthy days" or "sadness", and limited to relatively infrequent state or metropolitan estimates. Through the large scale analysis of social media data, robust estimation of population mental health is feasible at much higher resolutions, up to weekly estimates for counties. In the present work, we validate a pipeline that uses a sample of 1.2 billion Tweets from 2 million geo-located users to estimate mental health changes for the two leading mental health conditions, depression and anxiety. We find moderate to large associations between the language-based mental health assessments and survey scores from Gallup for multiple levels of granularity, down to the county-week (fixed effects $\beta = .25$ to $1.58$; $p<.001$). Language-based assessment allows for the cost-effective and scalable monitoring of population mental health at weekly time scales. Such spatially fine-grained time series are well suited to monitor effects of societal events and policies as well as enable quasi-experimental study designs in population health and other disciplines. Beyond mental health in the U.S., this method generalizes to a broad set of psychological outcomes and allows for community measurement in under-resourced settings where no traditional survey measures - but social media data - are available.
[[2302.13114] Sequential Query Encoding For Complex Query Answering on Knowledge Graphs](http://arxiv.org/abs/2302.13114) #robust
Query encoding (QE) is proposed as a fast and robust solution to complex query answering (CQA). In the encoding process, most existing QE methods first parse the logical query into an executable computational directed acyclic graph (DAG), then use neural networks to parameterize the operators, and finally, recursively execute these neuralized operators. However, the parameterization-and-execution paradigm may be over-complicated, as it can be structurally simplified by a single neural network encoder. Meanwhile, sequence encoders, like LSTM and Transformer, have proved to be effective for encoding semantic graphs in related tasks. Motivated by this, we propose sequential query encoding (SQE) as an alternative to encode queries for CQA. Instead of parameterizing and executing the computational graph, SQE first uses a search-based algorithm to linearize the computational graph to a sequence of tokens and then uses a sequence encoder to compute its vector representation. Then this vector representation is used as a query embedding to retrieve answers from the embedding space according to similarity scores. Despite its simplicity, SQE demonstrates state-of-the-art neural query encoding performance on FB15k, FB15k-237, and NELL on an extended benchmark including twenty-nine types of in-distribution queries. Further experiments show that SQE also demonstrates comparable knowledge inference capability on out-of-distribution queries, whose query types are not observed during the training process.
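The linearization step can be illustrated with a small sketch. The abstract only says that a "search-based algorithm" turns the computational graph into a token sequence, so the depth-first traversal, the bracket tokens, the operator names (`e` for an anchor entity, `p` for relation projection, `i` for intersection), and the nested-tuple query format below are all assumptions chosen for illustration, not the paper's actual vocabulary.

```python
def linearize(query):
    """Depth-first linearization of a tree-shaped query into tokens.

    A query is a nested tuple (operator, arg1, arg2, ...), where each
    argument is either a string (entity or relation name) or a sub-query.
    Brackets are emitted so the tree can be recovered from the sequence.
    """
    operator, *args = query
    tokens = ["(", operator]
    for arg in args:
        if isinstance(arg, tuple):
            tokens.extend(linearize(arg))  # recurse into the sub-query
        else:
            tokens.append(arg)             # leaf: entity or relation name
    tokens.append(")")
    return tokens


if __name__ == "__main__":
    # Hypothetical query: entities reached from e1 via r1 AND from e2 via r2.
    query = ("i", ("p", "r1", ("e", "e1")), ("p", "r2", ("e", "e2")))
    print(" ".join(linearize(query)))
```

The resulting token list is what a sequence encoder such as an LSTM or Transformer would consume after mapping each token to an embedding.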
[[2302.13049] CASIA-Iris-Africa: A Large-scale African Iris Image Database](http://arxiv.org/abs/2302.13049) #biometric
Iris biometrics is a phenotypic biometric trait that has proven to be agnostic to human natural physiological changes. Research on iris biometrics has progressed tremendously, partly due to publicly available iris databases. Various databases have been available to researchers that address pressing iris biometric challenges such as constraint, mobile, multispectral, synthetics, long-distance, contact lenses, liveness detection, etc. However, these databases mostly contain subjects of Caucasian and Asian descent with very few Africans. Despite many investigative studies on racial bias in face biometrics, very few studies on iris biometrics have been published, mainly due to the lack of racially diverse large-scale databases containing sufficient iris samples of Africans in the public domain. Furthermore, most of these databases contain a relatively small number of subjects and labelled images. This paper proposes a large-scale African database named CASIA-Iris-Africa that can be used as a complementary database for the iris recognition community to mediate the effect of racial biases on Africans. The database contains 28,717 images of 1023 African subjects (2046 iris classes) with age, gender, and ethnicity attributes that can be useful in demographically sensitive studies of Africans. Sets of specific application protocols are incorporated with the database to ensure the database's variability and scalability. Performance results of some open-source SOTA algorithms on the database are presented, which will serve as baseline performances. The relatively poor performances of the baseline algorithms on the proposed database despite better performance on other databases prove that racial biases exist in these iris recognition algorithms. The database will be made available on our website.
[[2302.13004] TBFormer: Two-Branch Transformer for Image Forgery Localization](http://arxiv.org/abs/2302.13004) #extraction
Image forgery localization aims to identify forged regions by capturing subtle traces from high-quality discriminative features. In this paper, we propose a Transformer-style network with two feature extraction branches for image forgery localization, named Two-Branch Transformer (TBFormer). Firstly, two feature extraction branches are elaborately designed, taking advantage of the discriminative stacked Transformer layers, for both RGB and noise domain features. Secondly, an Attention-aware Hierarchical-feature Fusion Module (AHFM) is proposed to effectively fuse hierarchical features from two different domains. Although the two feature extraction branches have the same architecture, their features have significant differences since they are extracted from different domains. We adopt position attention to embed them into a unified feature domain for hierarchical feature investigation. Finally, a Transformer decoder is constructed for feature reconstruction to generate the predicted mask. Extensive experiments on publicly available datasets demonstrate the effectiveness of the proposed model.
[[2302.13117] Abstractive Text Summarization using Attentive GRU based Encoder-Decoder](http://arxiv.org/abs/2302.13117) #extraction
In today's era, a huge volume of information exists everywhere. Therefore, it is crucial to evaluate that information and extract useful, and often summarized, information from it so that it may be used for relevant purposes. This extraction can be achieved through a key technique of artificial intelligence, namely machine learning. Indeed, automatic text summarization has emerged as an important application of machine learning in text processing. In this paper, an English text summarizer has been built with a GRU-based encoder and decoder. The Bahdanau attention mechanism has been added to overcome the problem of handling long sequences in the input text. A news-summary dataset has been used to train the model. The output is observed to outperform competitive models in the literature. The generated summary can be used as a newspaper headline.
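For readers unfamiliar with Bahdanau (additive) attention, the following dependency-free sketch computes one attention step: each encoder state $h_i$ is scored against the current decoder state $s$ as $v^\top \tanh(W_{dec} s + W_{enc} h_i)$, the scores are normalized with a softmax, and the context vector is the weighted sum of encoder states. The weight matrices here are hand-written toy values, not trained parameters, and the function and argument names are hypothetical; the abstract gives no details of the authors' implementation.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def additive_attention(decoder_state, encoder_states, w_dec, w_enc, v):
    """One step of additive attention.

    Returns (weights, context): softmax-normalized attention weights over
    the encoder states and the resulting weighted-sum context vector.
    """
    dec_proj = matvec(w_dec, decoder_state)
    scores = []
    for h in encoder_states:
        enc_proj = matvec(w_enc, h)
        hidden = [math.tanh(a + b) for a, b in zip(dec_proj, enc_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(encoder_states[0])
    context = [
        sum(w * h[j] for w, h in zip(weights, encoder_states))
        for j in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]  # toy weights for illustration
    states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    weights, context = additive_attention([1.0, 0.0], states, identity, identity, [1.0, 1.0])
    print(weights, context)
```

Because the context vector is recomputed at every decoding step, the decoder can look back at any input position instead of relying on a single fixed-length summary of a long input.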
[[2302.12972] A Preliminary Study on Pattern Reconstruction for Optimal Storage of Wearable Sensor Data](http://arxiv.org/abs/2302.12972) #extraction
Efficient querying and retrieval of healthcare data is posing a critical challenge today with numerous connected devices continuously generating petabytes of images, text, and internet of things (IoT) sensor data. One approach to efficiently store the healthcare data is to extract the relevant and representative features and store only those features instead of the continuous streaming data. However, it raises a question as to the amount of information content we can retain from the data and if we can reconstruct the pseudo-original data when needed. By facilitating relevant and representative feature extraction, storage and reconstruction of near original pattern, we aim to address some of the challenges faced by the explosion of the streaming data. We present a preliminary study, where we explored multiple autoencoders for concise feature extraction and reconstruction for human activity recognition (HAR) sensor data. Our Multi-Layer Perceptron (MLP) deep autoencoder achieved a storage reduction of 90.18% compared to the three other implemented autoencoders namely convolutional autoencoder, Long-Short Term Memory (LSTM) autoencoder, and convolutional LSTM autoencoder which achieved storage reductions of 11.18%, 49.99%, and 72.35% respectively. Encoded features from the autoencoders have smaller size and dimensions which help to reduce the storage space. For higher dimensions of the representation, storage reduction was low. But retention of relevant information was high, which was validated by classification performed on the reconstructed data.
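The storage-reduction percentages above can be read as the relative size difference between the raw signal and its encoded representation. The sketch below shows that arithmetic under the assumption that reduction is measured as `(original - encoded) / original`; the abstract does not state the exact measurement procedure, and the sizes used in the example are made-up values rather than figures from the study.

```python
def storage_reduction_percent(original_size, encoded_size):
    """Percentage of storage saved by keeping only the encoded features.

    Sizes can be in any consistent unit (bytes, number of floats, ...).
    """
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    if encoded_size < 0:
        raise ValueError("encoded_size must be non-negative")
    return 100.0 * (original_size - encoded_size) / original_size


if __name__ == "__main__":
    # Hypothetical sizes: 1000 raw values compressed to a 100-value code.
    print(storage_reduction_percent(1000, 100))
```

A larger latent dimension lowers this percentage, which matches the abstract's observation that higher-dimensional representations save less storage while retaining more information.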
[[2302.12862] FLINT: A Platform for Federated Learning Integration](http://arxiv.org/abs/2302.12862) #federate
Cross-device federated learning (FL) has been well-studied from algorithmic, system scalability, and training speed perspectives. Nonetheless, moving from centralized training to cross-device FL for millions or billions of devices presents many risks, including performance loss, developer inertia, poor user experience, and unexpected application failures. In addition, the corresponding infrastructure, development costs, and return on investment are difficult to estimate. In this paper, we present a device-cloud collaborative FL platform that integrates with an existing machine learning platform, providing tools to measure real-world constraints, assess infrastructure capabilities, evaluate model training performance, and estimate system resource requirements to responsibly bring FL into production. We also present a decision workflow that leverages the FL-integrated platform to comprehensively evaluate the trade-offs of cross-device FL and share our empirical evaluations of business-critical machine learning applications that impact hundreds of millions of users.
[[2302.13001] Better Generative Replay for Continual Federated Learning](http://arxiv.org/abs/2302.13001) #federate
Federated learning is a technique that enables a centralized server to learn from distributed clients via communications without accessing the client local data. However, existing federated learning works mainly focus on a single task scenario with static data. In this paper, we introduce the problem of continual federated learning, where clients incrementally learn new tasks and history data cannot be stored due to certain reasons, such as limited storage and data retention policy. Generative replay based methods are effective for continual learning without storing history data, but adapting them for this setting is challenging. By analyzing the behaviors of clients during training, we find that the unstable training process caused by distributed training on non-IID data leads to a notable performance degradation. To address this problem, we propose our FedCIL model with two simple but effective solutions: model consolidation and consistency enforcement. Our experimental results on multiple benchmark datasets demonstrate that our method significantly outperforms baselines.
[[2302.13136] Toward Fairness in Text Generation via Mutual Information Minimization based on Importance Sampling](http://arxiv.org/abs/2302.13136) #fair
Pretrained language models (PLMs), such as GPT2, have achieved remarkable empirical performance in text generation tasks. However, pretrained on large-scale natural language corpora, the generated text from PLMs may exhibit social bias against disadvantaged demographic groups. To improve the fairness of PLMs in text generation, we propose to minimize the mutual information between the semantics in the generated text sentences and their demographic polarity, i.e., the demographic group to which the sentence is referring. In this way, the mentioning of a demographic group (e.g., male or female) is encouraged to be independent from how it is described in the generated text, thus effectively alleviating the social bias. Moreover, we propose to efficiently estimate the upper bound of the above mutual information via importance sampling, leveraging a natural language corpus. We also propose a distillation mechanism that preserves the language modeling ability of the PLMs after debiasing. Empirical results on real-world benchmarks demonstrate that the proposed method yields superior performance in terms of both fairness and language modeling ability.
[[2302.12977] Fair Attribute Completion on Graph with Missing Attributes](http://arxiv.org/abs/2302.12977) #fair
Tackling unfairness in graph learning models is a challenging task, as the unfairness issues on graphs involve both attributes and topological structures. Existing work on fair graph learning simply assumes that attributes of all nodes are available for model training and then makes fair predictions. In practice, however, the attributes of some nodes might not be accessible due to missing data or privacy concerns, which makes fair graph learning even more challenging. In this paper, we propose FairAC, a fair attribute completion method, to complement missing information and learn fair node embeddings for graphs with missing attributes. FairAC adopts an attention mechanism to deal with the attribute missing problem and meanwhile, it mitigates two types of unfairness, i.e., feature unfairness from attributes and topological unfairness due to attribute completion. FairAC can work on various types of homogeneous graphs and generate fair embeddings for them and thus can be applied to most downstream tasks to improve their fairness performance. To the best of our knowledge, FairAC is the first method that jointly addresses the graph attribute completion and graph unfairness problems. Experimental results on benchmark datasets show that our method achieves better fairness performance with less sacrifice in accuracy, compared with the state-of-the-art methods of fair graph learning.
[[2302.13034] Does Noise Affect Housing Prices? A Case Study in the Urban Area of Thessaloniki](http://arxiv.org/abs/2302.13034) #interpretability
Real estate markets depend on various methods to predict housing prices, including models that have been trained on datasets of residential or commercial properties. Most studies endeavor to create more accurate machine learning models by utilizing data such as basic property characteristics as well as urban features like distances from amenities and road accessibility. Even though environmental factors like noise pollution can potentially affect prices, the research around this topic is limited. One of the reasons is the lack of data. In this paper, we reconstruct and make publicly available a general purpose noise pollution dataset based on published studies conducted by the Hellenic Ministry of Environment and Energy for the city of Thessaloniki, Greece. Then, we train ensemble machine learning models, like XGBoost, on property data for different areas of Thessaloniki to investigate the way noise influences prices through interpretability evaluation techniques. Our study provides a new noise pollution dataset that not only demonstrates the impact noise has on housing prices, but also indicates that the influence of noise on prices significantly varies among different areas of the same city.
[[2302.13083] Knowledge Graph Completion with Counterfactual Augmentation](http://arxiv.org/abs/2302.13083) #interpretability
In recent years, Graph Neural Networks (GNNs) have demonstrated great success in Knowledge Graph Completion (KGC) by modeling how entities and relations interact. However, most of them are designed to learn from the observed graph structure, which appears to have an imbalanced relation distribution during the training stage. Motivated by the causal relationship among the entities on a knowledge graph, we explore this defect through a counterfactual question: "would the relation still exist if the neighborhood of entities became different from observation?". With a carefully designed instantiation of a causal model on the knowledge graph, we generate the counterfactual relations to answer the question by regarding the representations of entity pair given relation as context, structural information of relation-aware neighborhood as treatment, and validity of the composed triplet as the outcome. Furthermore, we incorporate the created counterfactual relations with the GNN-based framework on KGs to augment their learning of entity pair representations from both the observed and counterfactual relations. Experiments on benchmarks show that our proposed method outperforms existing methods on the task of KGC, achieving new state-of-the-art results. Moreover, we demonstrate that the proposed counterfactual relations-based augmentation also enhances the interpretability of the GNN-based framework through the path interpretations of predictions.
[[2302.13153] Directed Diffusion: Direct Control of Object Placement through Attention Guidance](http://arxiv.org/abs/2302.13153) #diffusion
Text-guided diffusion models such as DALLE-2, IMAGEN, and Stable Diffusion are able to generate an effectively endless variety of images given only a short text prompt describing the desired image content. In many cases the images are very high quality as well. However, these models often struggle to compose scenes containing several key objects such as characters in specified positional relationships. Unfortunately, this capability to "direct" the placement of characters and objects both within and across images is crucial in storytelling, as recognized in the literature on film and animation theory. In this work we take a particularly straightforward approach to providing the needed direction, by injecting "activation" at desired positions in the cross-attention maps corresponding to the objects under control, while attenuating the remainder of the map. The resulting approach is a step toward generalizing the applicability of text-guided diffusion models beyond single images to collections of related images, as in storybooks. To the best of our knowledge, our Directed Diffusion method is the first diffusion technique that provides positional control over multiple objects, while making use of an existing pre-trained model and maintaining a coherent blend between the positioned objects and the background. Moreover, it requires only a few lines to implement.
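As a rough illustration of the idea described in this abstract (adding activation inside a target region of a cross-attention map and attenuating the rest), here is a minimal NumPy sketch. It is not the authors' implementation: the function name `direct_attention`, the `inject` and `damp` parameters, the rectangular box, and the final renormalization are all assumptions made for the example.

```python
import numpy as np

def direct_attention(attn_map, box, inject=1.0, damp=0.1):
    """Edit one token's cross-attention map to favor a target region.

    attn_map: 2D array of non-negative attention weights for a single text token.
    box: (top, left, bottom, right) in map coordinates; bottom/right are exclusive.
    inject, damp: hypothetical strengths chosen for illustration only.
    """
    top, left, bottom, right = box
    mask = np.zeros(attn_map.shape, dtype=bool)
    mask[top:bottom, left:right] = True
    # Add activation inside the box, scale down everything outside it.
    edited = np.where(mask, attn_map + inject, attn_map * damp)
    # Renormalize so the map still sums to 1 (an assumption of this sketch).
    return edited / edited.sum()

# Usage: push a uniform 4x4 map toward its top-left 2x2 corner.
attn = np.full((4, 4), 1.0 / 16)
out = direct_attention(attn, (0, 0, 2, 2))
```

In a real pipeline an edit of this kind would be applied to the cross-attention layers of the pre-trained model during some of the denoising steps; the abstract does not say which layers or steps, so the sketch leaves that out.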
[[2302.12881] Denoising diffusion algorithm for inverse design of microstructures with fine-tuned nonlinear material properties](http://arxiv.org/abs/2302.12881) #diffusion
In this paper, we introduce a denoising diffusion algorithm to discover microstructures with nonlinear fine-tuned properties. Denoising diffusion probabilistic models are generative models that use diffusion-based dynamics to gradually denoise images and generate realistic synthetic samples. By learning the reverse of a Markov diffusion process, we design an artificial intelligence to efficiently manipulate the topology of microstructures to generate a massive number of prototypes that exhibit constitutive responses sufficiently close to designated nonlinear constitutive responses. To identify the subset of microstructures with sufficiently precise fine-tuned properties, a convolutional neural network surrogate is trained to replace high-fidelity finite element simulations to filter out prototypes outside the admissible range. The results of this study indicate that the denoising diffusion process is capable of creating microstructures of fine-tuned nonlinear material properties within the latent space of the training data. More importantly, the resulting algorithm can be easily extended to incorporate additional topological and geometric modifications by introducing high-dimensional structures embedded in the latent space. The algorithm is tested on the open-source mechanical MNIST data set. Consequently, this algorithm is not only capable of performing inverse design of nonlinear effective media but also learns the nonlinear structure-property map to quantitatively understand the multiscale interplay among the geometry and topology and their effective macroscopic properties.
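For readers unfamiliar with the "reverse of a Markov diffusion process" mentioned in this abstract, the standard denoising diffusion probabilistic model formulation can be written as below. The abstract does not state the exact parameterization used in the paper, so the noise schedule $\beta_t$ and the learned mean $\mu_\theta$ and covariance $\Sigma_\theta$ are generic symbols assumed here for illustration.

$$
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right),
\qquad
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)
$$

The forward process $q$ gradually adds Gaussian noise to a sample, and the learned reverse process $p_\theta$ removes it step by step to generate new samples.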