[[2208.06165] Customer Empowered Privacy-Preserving Secure Verification using Decentralized Identifier and Verifiable Credentials For Product Delivery Using Robots](http://arxiv.org/abs/2208.06165)
In the age of respiratory illnesses like COVID-19, we understand the necessity for a robot-based delivery system to ensure safe and contact-free courier delivery. A blockchain-based Decentralized IDentifier (DID) gives people total power over their identities while preserving auditability and anonymity. Both a customer's mobile phone and a robot are machines built around a chip, making it simple to deploy a physical unclonable function (PUF) based verification system between the robot and the customer. This article presents a novel framework and a first customer verification scheme for verified courier delivery utilizing blockchain-enabled DIDs and PUF-enabled robots. We employ the DID for customer authentication between a robot (a service provider) and a customer, and the PUF for robot verification by the customer. We have also put the proposed work into practice and demonstrated its capabilities in terms of throughput, latency, computing cost, and communication cost. We also give a formal security proof for the proposed user verification scheme based on the Tamarin prover.
[[2208.06223] Perfectly Secure Synchronous MPC with Asynchronous Fallback Guarantees Against General Adversaries](http://arxiv.org/abs/2208.06223)
In this work, we study perfectly-secure multi-party computation (MPC) against general (non-threshold) adversaries. Known protocols in a synchronous network are secure against $Q^{(3)}$ adversary structures, while in an asynchronous network, known protocols are secure against $Q^{(4)}$ adversary structures. A natural question is whether there exists a single protocol which remains secure against $Q^{(3)}$ and $Q^{(4)}$ adversary structures in a synchronous and in an asynchronous network respectively, where the parties are not aware of the network type. We design the first such best-of-both-worlds protocol against general adversaries. Our result generalizes the result of Appan, Chandramouli and Choudhury (PODC 2022), which presents a best-of-both-worlds perfectly-secure protocol against threshold adversaries.
To design our protocol, we present two important building blocks which are of independent interest. The first building block is a best-of-both-worlds perfectly-secure Byzantine agreement (BA) protocol for $Q^{(3)}$ adversary structures, which remains secure both in a synchronous, as well as an asynchronous network. The second building block is a best-of-both-worlds perfectly-secure verifiable secret-sharing (VSS) protocol, which remains secure against $Q^{(3)}$ and $Q^{(4)}$ adversary structures in a synchronous network and an asynchronous network respectively.
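The $Q^{(k)}$ condition used above has a compact definition: an adversary structure over a party set satisfies $Q^{(k)}$ if no $k$ sets from the structure together cover all parties. The following is a minimal sketch of that check, not code from the paper; the function name `satisfies_q` and the representation of the structure as a list of its maximal sets are assumptions made for illustration.

```python
from itertools import combinations_with_replacement


def satisfies_q(parties, structure, k):
    """Return True if no k sets from `structure` jointly cover `parties`.

    `structure` is assumed to list the maximal corruptible subsets;
    checking maximal sets is enough because the structure is monotone.
    """
    parties = frozenset(parties)
    sets = [frozenset(s) for s in structure]
    # Sets may repeat, so combinations with replacement are enumerated.
    for combo in combinations_with_replacement(sets, k):
        if frozenset().union(*combo) >= parties:
            return False
    return True


# Threshold special case: n = 4 parties, at most t = 1 corruption.
four_parties = {1, 2, 3, 4}
singletons = [{1}, {2}, {3}, {4}]
q3_holds = satisfies_q(four_parties, singletons, 3)  # t < n/3
q4_holds = satisfies_q(four_parties, singletons, 4)  # t < n/4 fails
```

In the threshold special case this reduces to the familiar bounds $t < n/3$ and $t < n/4$, which is why the example with four parties and one corruption satisfies $Q^{(3)}$ but not $Q^{(4)}$. The enumeration is exponential in $k$ and in the number of maximal sets, so it is only meant for small illustrative structures.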
[[2208.06231] Mutual authentication in self-organized VANETs](http://arxiv.org/abs/2208.06231)
The practical deployment of vehicular networks is still a pending issue. In this paper we describe a new self-organized method of authentication for VANETs, which allows their widespread, fast and secure implementation. Our proposal does not involve any central certification authority because the nodes themselves certify the validity of public keys of the other nodes. On the one hand we propose an algorithm that each node must use to choose the public key certificates for its local store. On the other hand, we also describe a new node authentication method based on a cryptographic protocol including a zero-knowledge proof that each node must use to convince another node on the possession of certain secret without revealing anything about it, which allows non-encrypted communication during authentication. Thanks to the combination of the aforementioned tools, the cooperation among vehicles can be used for developing several practical applications of VANETs, such as detection and warning about abnormal traffic conditions. One of the most interesting aspects of our proposal is that it only requires existing devices such as smartphones, because the designed schemes are fully distributed and self-organized. In this work we include an analysis of both an NS-2 simulation and a real device implementation of the proposed algorithms, which enables us to extract promising conclusions and several possible improvements and open questions for further research.
[[2208.06003] Security of IoT Device: Perspective Forensic/Anti-Forensic Issues on Invalid Area of NAND Flash Memory](http://arxiv.org/abs/2208.06003)
A NAND flash memory-based IoT device can still leave original personal data behind in an invalid area even after the data has been deleted. In this paper, we raise the forensic issue of original data remaining in unmanaged blocks of NAND flash memory and introduce methods for securely deleting such data in the invalid area. We also propose a verification technique for secure deletion based on cell count information, which refers to the difference in bits between the personal data and the data stored in the block. Whether the verification passes or fails for a given cell count is determined in consideration of error correction capabilities. With the forensic issue of de-identification being a vital theme in the big data industry, the threat of serious privacy breaches, coupled with our proposal to prevent these attacks, makes such techniques a critical technological necessity for the future.
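The cell count check described above can be illustrated with a short sketch. It is not the authors' implementation: the function names, the byte-level representation of a block, and the pass rule (the bit difference must exceed what error correction could repair) are assumptions chosen to make the idea concrete.

```python
def bit_difference(original: bytes, stored: bytes) -> int:
    """Count the bits that differ between the original data and the block."""
    if len(original) != len(stored):
        raise ValueError("buffers must have the same length")
    return sum(bin(a ^ b).count("1") for a, b in zip(original, stored))


def deletion_verified(original: bytes, stored: bytes,
                      ecc_correctable_bits: int) -> bool:
    """Assumed pass rule: deletion counts as secure only if the stored block
    differs from the original in more bits than ECC could correct, so the
    original cannot be recovered by error correction alone."""
    return bit_difference(original, stored) > ecc_correctable_bits


personal = b"\xff\x00"
overwritten = b"\x00\x00"     # 8 bits differ from the original
barely_changed = b"\xfe\x00"  # only 1 bit differs
```

With an assumed correction capability of 4 bits, `deletion_verified(personal, overwritten, 4)` passes while `deletion_verified(personal, barely_changed, 4)` fails, since a single flipped bit is within what error correction could undo.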
[[2208.06075] Testing SOAR Tools in Use](http://arxiv.org/abs/2208.06075)
Modern security operation centers (SOCs) rely on operators and a tapestry of logging and alerting tools with large scale collection and query abilities. SOC investigations are tedious as they rely on manual efforts to query diverse data sources, overlay related logs, and correlate the data into information and then document results in a ticketing system. Security orchestration, automation, and response (SOAR) tools are a new technology that promise to collect, filter, and display needed data; automate common tasks that require SOC analysts' time; facilitate SOC collaboration; and, improve both efficiency and consistency of SOCs. SOAR tools have never been tested in practice to evaluate their effect and understand them in use. In this paper, we design and administer the first hands-on user study of SOAR tools, involving 24 participants and 6 commercial SOAR tools. Our contributions include the experimental design, itemizing six characteristics of SOAR tools and a methodology for testing them. We describe configuration of the test environment in a cyber range, including network, user, and threat emulation; a full SOC tool suite; and creation of artifacts allowing multiple representative investigation scenarios to permit testing. We present the first research results on SOAR tools. We found that SOAR configuration is critical, as it involves creative design for data display and automation. We found that SOAR tools increased efficiency and reduced context switching during investigations, although ticket accuracy and completeness (indicating investigation quality) decreased with SOAR use. Our findings indicated that user preferences are slightly negatively correlated with their performance with the tool; overautomation was a concern of senior analysts, and SOAR tools that balanced automation with assisting a user to make decisions were preferred.
[[2208.06136] How far are German companies in improving security through static program analysis tools?](http://arxiv.org/abs/2208.06136)
As security becomes more relevant for many companies, the popularity of static program analysis (SPA) tools is increasing. In this paper, we target the use of SPA tools among companies in Germany with a focus on security. We give insights on the current issues and the developers' willingness to configure the tools to overcome these issues. Compared to previous studies, our study considers the companies' culture and processes for using SPA tools. We conducted an online survey with 256 responses and semi-structured interviews with 17 product owners and executives from multiple companies. Our results show a diversity in the usage of tools. Only half of our survey participants use SPA tools. The free tools tend to be more popular among software developers. In most companies, software developers are encouraged to use free tools, whereas commercial tools can be requested. However, the product owners and executives in our interviews reported that their developers do not request new tools. We also find out that automatic security checks with tools are rarely performed on each release.
[[2208.06147] Software implementation of the SNOW 3G Generator on iOS and Android platforms](http://arxiv.org/abs/2208.06147)
The standard for wireless communication of high-speed data in mobile phones and data terminals, called LTE (Long-Term Evolution) and marketed as 4G/LTE, is quickly being adopted worldwide. The security of this type of communication is a crucial factor mainly due to its mobile and wireless nature. This work includes a practical analysis of the SNOW 3G generator used to protect the confidentiality and integrity in LTE communications. In particular, several techniques to perform multiplications and LFSR operations have been studied and implemented on both iOS and Android platforms. The evaluation of those implementations led to some conclusions that could be used to improve the efficiency of future implementations of the standard.
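As background for the multiplication techniques mentioned above, the SNOW 3G specification defines its LFSR multiplications in terms of a byte-level primitive, MULx, and its repeated application, MULxPOW. The sketch below follows those two definitions and then shows a precomputed lookup table as one alternative way of evaluating the same function. It is an illustration only, not the implementation evaluated in the paper; the table name and the choice of exponent 23 with constant 0xA9 are used here simply as an example input.

```python
def mulx(v: int, c: int) -> int:
    """MULx on 8-bit values: shift left by one bit and XOR with c
    if the most significant bit of v was set."""
    shifted = (v << 1) & 0xFF
    return shifted ^ c if v & 0x80 else shifted


def mulx_pow(v: int, i: int, c: int) -> int:
    """MULxPOW: apply MULx to v a total of i times (i = 0 returns v)."""
    for _ in range(i):
        v = mulx(v, c)
    return v


# Alternative technique: trade memory for speed by precomputing all 256
# results once, so each later evaluation is a single table lookup.
EXAMPLE_TABLE = [mulx_pow(v, 23, 0xA9) for v in range(256)]


def mulx_pow_lookup(v: int) -> int:
    """Table-based equivalent of mulx_pow(v, 23, 0xA9)."""
    return EXAMPLE_TABLE[v]
```

The loop version costs up to `i` shift-and-XOR steps per call, whereas the table version costs one memory access after a one-time setup, which is the kind of trade-off an implementation study on mobile platforms would compare.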
[[2208.06153] How to build vehicular ad-hoc networks on smartphones](http://arxiv.org/abs/2208.06153)
Vehicular ad-hoc networks have been defined in the literature as communication networks that disseminate information among vehicles to help reduce traffic accidents and congestion. The practical deployment of such networks has been delayed mainly due to economic and technical issues. This paper describes a new software application to detect traffic incidents and exchange information about them, using only smartphones, without any central authority or additional equipment. Both road safety and communication security have been taken into account in the application design. On the one hand, the interface has been designed to avoid distractions while driving because it operates automatically and independently of the driver, through voice prompts. On the other hand, communication security, which is essential in critical wireless networks, is provided through the protection of attributes such as authenticity, privacy, integrity and non-repudiation. All this is achieved without increasing the price of vehicles and without requiring the integration of new devices either in vehicles or on roads. The only prerequisite is to have a smartphone equipped with Wi-Fi connectivity and GPS location in each vehicle. The proposed application has been successfully validated both in large-scale NS-2 simulations and in small-scale real tests to detect traffic congestion and empty parking spaces.
[[2208.06405] Collective Obfuscation and Crowdsourcing](http://arxiv.org/abs/2208.06405)
Crowdsourcing technologies rely on groups of people to input information that may be critical for decision-making. This work examines obfuscation in the context of reporting technologies. We show that widespread use of reporting platforms comes with unique security and privacy implications, and introduce a threat model and corresponding taxonomy to outline some of the many attack vectors in this space. We then perform an empirical analysis of a dataset of call logs from a controversial, real-world reporting hotline and identify coordinated obfuscation strategies that are intended to hinder the platform's legitimacy. We propose a variety of statistical measures to quantify the strength of this obfuscation strategy with respect to the structural and semantic characteristics of the reporting attacks in our dataset.
[[2208.06216] Is Your Model Sensitive? SPeDaC: A New Benchmark for Detecting and Classifying Sensitive Personal Data](http://arxiv.org/abs/2208.06216)
In recent years we have seen exponential growth in applications, including dialogue systems, that handle sensitive personal information. This has brought to light the extremely important issue of personal data protection in virtual environments. Firstly, a well-performing model should be able to distinguish sentences with sensitive content from neutral sentences. Secondly, it should be able to identify the category of personal data they contain. In this way, a different privacy treatment could be considered for each category. Although the literature contains work on automatic sensitive data identification, it is often conducted on different domains or languages without a common benchmark. To fill this gap, in this work we introduce SPeDaC, a new annotated benchmark for the identification of sensitive personal data categories. Furthermore, we provide an extensive evaluation of our dataset, conducted using different baselines and a classifier based on RoBERTa, a neural architecture that achieves strong performance on both the detection of sensitive sentences and the classification of personal data categories.
[[2208.06093] Scalable and Sparsity-Aware Privacy-Preserving K-means Clustering with Application to Fraud Detection](http://arxiv.org/abs/2208.06093)
K-means is one of the most widely used clustering models in practice. Due to the problem of data isolation and the requirement for high model performance, how to jointly build practical and secure K-means for multiple parties has become an important topic for many applications in the industry. Existing work on this is mainly of two types. The first type has efficiency advantages, but information leakage raises potential privacy risks. The second type is provably secure but is inefficient, or even unusable, in large-scale sparse-data scenarios. In this paper, we propose a new framework for efficient sparsity-aware K-means with three characteristics. First, our framework is divided into a data-independent offline phase and a much faster online phase, and the offline phase allows almost all cryptographic operations to be pre-computed. Second, we take advantage of vectorization techniques in both the online and offline phases. Third, we adopt sparse matrix multiplication for the data sparsity scenario to improve efficiency further. We conduct comprehensive experiments on three synthetic datasets and deploy our model in a real-world fraud detection task. Our experimental results show that, compared with the state-of-the-art solution, our model achieves competitive performance in terms of both running time and communication size, especially on sparse datasets.
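To see why sparsity matters for the K-means assignment step, it helps to expand the squared distance as $\|x - c\|^2 = \|x\|^2 - 2\,x \cdot c + \|c\|^2$: only the dot product involves both the sample and the centroid, and it needs to touch only the non-zero entries of $x$. The sketch below shows this in plaintext with no cryptography at all, so it is not the paper's protocol; the dictionary representation of a sparse sample and the function names are assumptions for illustration.

```python
def sparse_dot(x: dict, c: list) -> float:
    """Dot product that visits only the non-zero entries of sparse x."""
    return sum(value * c[index] for index, value in x.items())


def assign_cluster(x: dict, centroids: list) -> int:
    """Return the index of the nearest centroid using
    ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2."""
    x_norm = sum(value * value for value in x.values())
    best_index, best_distance = -1, float("inf")
    for j, c in enumerate(centroids):
        c_norm = sum(t * t for t in c)  # in practice cached per iteration
        distance = x_norm - 2.0 * sparse_dot(x, c) + c_norm
        if distance < best_distance:
            best_index, best_distance = j, distance
    return best_index


# A sample with a single non-zero feature out of three dimensions.
sample = {0: 1.0}
centers = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
nearest = assign_cluster(sample, centers)
```

The cost of each dot product scales with the number of non-zero features rather than the full dimension, which is the saving a sparsity-aware multiplication aims to preserve once the computation is moved into a secure setting.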
[[2208.06135] Private Domain Adaptation from a Public Source](http://arxiv.org/abs/2208.06135)
A key problem in a variety of applications is that of domain adaptation from a public source domain, for which a relatively large amount of labeled data with no privacy constraints is at one's disposal, to a private target domain, for which a private sample is available with very few or no labeled data. In regression problems with no privacy constraints on the source or target data, a discrepancy minimization algorithm based on several theoretical guarantees was shown to outperform a number of other adaptation algorithm baselines. Building on that approach, we design differentially private discrepancy-based algorithms for adaptation from a source domain with public labeled data to a target domain with unlabeled private data. The design and analysis of our private algorithms critically hinge upon several key properties we prove for a smooth approximation of the weighted discrepancy, such as its smoothness with respect to the $\ell_1$-norm and the sensitivity of its gradient. Our solutions are based on private variants of Frank-Wolfe and Mirror-Descent algorithms. We show that our adaptation algorithms benefit from strong generalization and privacy guarantees and report the results of experiments demonstrating their effectiveness.
[[2208.06163] Dropout is NOT All You Need to Prevent Gradient Leakage](http://arxiv.org/abs/2208.06163)
Gradient inversion attacks on federated learning systems reconstruct client training data from exchanged gradient information. To defend against such attacks, a variety of defense mechanisms were proposed. However, they usually lead to an unacceptable trade-off between privacy and model utility. Recent observations suggest that dropout could mitigate gradient leakage and improve model utility if added to neural networks. Unfortunately, this phenomenon has not been systematically researched yet. In this work, we thoroughly analyze the effect of dropout on iterative gradient inversion attacks. We find that state of the art attacks are not able to reconstruct the client data due to the stochasticity induced by dropout during model training. Nonetheless, we argue that dropout does not offer reliable protection if the dropout induced stochasticity is adequately modeled during attack optimization. Consequently, we propose a novel Dropout Inversion Attack (DIA) that jointly optimizes for client data and dropout masks to approximate the stochastic client model. We conduct an extensive systematic evaluation of our attack on four seminal model architectures and three image classification datasets of increasing complexity. We find that our proposed attack bypasses the protection seemingly induced by dropout and reconstructs client data with high fidelity. Our work demonstrates that privacy inducing changes to model architectures alone cannot be assumed to reliably protect from gradient leakage and therefore should be combined with complementary defense mechanisms.
[[2208.05969] Safety and Performance, Why not Both? Bi-Objective Optimized Model Compression toward AI Software Deployment](http://arxiv.org/abs/2208.05969)
The size of deep learning models in artificial intelligence (AI) software is increasing rapidly, which hinders the large-scale deployment on resource-restricted devices (e.g., smartphones). To mitigate this issue, AI software compression plays a crucial role, which aims to compress model size while keeping high performance. However, the intrinsic defects in the big model may be inherited by the compressed one. Such defects may be easily leveraged by attackers, since the compressed models are usually deployed in a large number of devices without adequate protection. In this paper, we try to address the safe model compression problem from a safety-performance co-optimization perspective. Specifically, inspired by the test-driven development (TDD) paradigm in software engineering, we propose a test-driven sparse training framework called SafeCompress. By simulating the attack mechanism as the safety test, SafeCompress can automatically compress a big model to a small one following the dynamic sparse training paradigm. Further, considering a representative attack, i.e., membership inference attack (MIA), we develop a concrete safe model compression mechanism, called MIA-SafeCompress. Extensive experiments are conducted to evaluate MIA-SafeCompress on five datasets for both computer vision and natural language processing tasks. The results verify the effectiveness and generalization of our method. We also discuss how to adapt SafeCompress to other attacks besides MIA, demonstrating the flexibility of SafeCompress.
[[2208.06222] Scale-free Photo-realistic Adversarial Pattern Attack](http://arxiv.org/abs/2208.06222)
Traditional pixel-wise image attack algorithms suffer from poor robustness to defense algorithms, i.e., the attack strength degrades dramatically when defense algorithms are applied. Although Generative Adversarial Networks (GAN) can partially address this problem by synthesizing a more semantically meaningful texture pattern, the main limitation is that existing generators can only generate images of a specific scale. In this paper, we propose a scale-free generation-based attack algorithm that synthesizes semantically meaningful adversarial patterns globally to images with arbitrary scales. Our generative attack approach consistently outperforms the state-of-the-art methods on a wide range of attack settings, i.e. the proposed approach largely degraded the performance of various image classification, object detection, and instance segmentation algorithms under different advanced defense methods.
[[2208.06092] On deceiving malware classification with section injection](http://arxiv.org/abs/2208.06092)
We investigate how to modify executable files to deceive malware classification systems. This work's main contribution is a methodology to inject bytes randomly across a malware file and use it both as an attack to decrease classification accuracy and as a defensive method, augmenting the data available for training. It respects the operating system file format to make sure the malware will still execute after our injection and will not change its behavior. We reproduced five state-of-the-art malware classification approaches to evaluate our injection scheme: one based on GIST+KNN, three CNN variations and one Gated CNN. We performed our experiments on a public dataset with 9,339 malware samples from 25 different families. Our results show that a mere increase of 7% in the malware size causes an accuracy drop between 25% and 40% for malware family classification. They show that an automatic malware classification system may not be as trustworthy as initially reported in the literature. We also evaluate using modified malware samples alongside the original ones to increase the networks' robustness against the mentioned attacks. Results show that a combination of reordering malware sections and injecting random data can improve overall performance of the classification. Code available at https://github.com/adeilsonsilva/malware-injection.
[[2208.06130] Analysis, Detection, and Classification of Android Malware using System Calls](http://arxiv.org/abs/2208.06130)
With its increasing popularity over the last decade, Android has become popular among users as well as attackers. The vast number of Android users draws the attention of attackers to the platform. Because the variety and attack techniques of Android malware evolve continuously, detection methods need to be updated as well. Most existing research is based on static features, and very few works focus on dynamic features. In this paper, we fill this gap in the literature by detecting Android malware using system calls. We run the malicious app in a monitored and controlled environment using an emulator. Simulated events are triggered during runtime to activate the app's malicious behavior. Logs collected during the app's runtime are analyzed and fed to different machine learning models for malware detection and family classification. The results indicate that K-Nearest Neighbor and the Decision Tree gave the highest accuracy in malware detection and family classification, respectively.
[[2208.06176] A Knowledge Distillation-Based Backdoor Attack in Federated Learning](http://arxiv.org/abs/2208.06176)
Federated Learning (FL) is a novel framework for decentralized machine learning. Due to its decentralized nature, FL is vulnerable to adversarial attacks during training, e.g., backdoor attacks. A backdoor attack aims to inject a backdoor into the machine learning model such that the model behaves incorrectly in an attacker-chosen way on test samples carrying a specific backdoor trigger. Although a range of backdoor attack methods for FL has been introduced, there are also methods for defending against them. Many of the defense methods exploit the abnormal characteristics of backdoored models or the difference between backdoored models and regular models. To bypass these defenses, we need to reduce this difference and these abnormal characteristics. We find that one source of such abnormality is that a backdoor attack directly flips the labels of data when poisoning it. However, current studies of backdoor attacks in FL do not mainly focus on reducing the difference between backdoored models and regular models. In this paper, we propose Adversarial Knowledge Distillation (ADVKD), a method that combines knowledge distillation with backdoor attacks in FL. With knowledge distillation, we can reduce the abnormal characteristics in the model that result from label flipping, so that the model can bypass the defenses. Compared to current methods, we show that ADVKD not only reaches a higher attack success rate, but also successfully bypasses the defenses when other methods fail. To further explore the performance of ADVKD, we test how its parameters affect performance under different scenarios. Based on the experimental results, we summarize how to adjust the parameters for better performance under different scenarios. We also use several methods to visualize the effect of different attacks and explain the effectiveness of ADVKD.
[[2208.06195] Category-Level Pose Retrieval with Contrastive Features Learnt with Occlusion Augmentation](http://arxiv.org/abs/2208.06195)
Pose estimation is usually tackled as either a bin classification problem or as a regression problem. In both cases, the idea is to directly predict the pose of an object. This is a non-trivial task because of appearance variations of similar poses and similarities between different poses. Instead, we follow the key idea that it is easier to compare two poses than to estimate them. Render-and-compare approaches have been employed to that end, however, they tend to be unstable, computationally expensive, and slow for real-time applications. We propose doing category-level pose estimation by learning an alignment metric using a contrastive loss with a dynamic margin and a continuous pose-label space. For efficient inference, we use a simple real-time image retrieval scheme with a reference set of renderings projected to an embedding space. To achieve robustness to real-world conditions, we employ synthetic occlusions, bounding box perturbations, and appearance augmentations. Our approach achieves state-of-the-art performance on PASCAL3D and OccludedPASCAL3D, as well as high-quality results on KITTI3D.
[[2208.06359] A Case for Rejection in Low Resource ML Deployment](http://arxiv.org/abs/2208.06359)
Building reliable AI decision support systems requires a robust set of data on which to train models, with respect to both quantity and diversity. Obtaining such datasets can be difficult in resource-limited settings, or for applications in early stages of deployment. Sample rejection is one way to work around this challenge; however, much of the existing work in this area is ill-suited for such scenarios. This paper substantiates that position and proposes a simple solution as a proof-of-concept baseline.
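For readers unfamiliar with sample rejection, the generic idea is that a model abstains from predicting when it is not confident enough, deferring the case to a human or another process. The sketch below shows the most common textbook form, thresholding the top predicted probability. It is not the baseline proposed in the paper, whose details the abstract does not give; the function name and the default threshold of 0.8 are assumptions.

```python
from typing import Optional, Sequence


def predict_or_reject(probabilities: Sequence[float],
                      threshold: float = 0.8) -> Optional[int]:
    """Return the index of the most probable class, or None (reject)
    when the top probability falls below the confidence threshold."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return best if probabilities[best] >= threshold else None


confident = predict_or_reject([0.1, 0.9])    # accepted, class 1
uncertain = predict_or_reject([0.55, 0.45])  # rejected, returns None
```

Raising the threshold rejects more samples and typically improves accuracy on the accepted ones, so the threshold acts as the knob that trades coverage against reliability.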
[[2208.06061] Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages](http://arxiv.org/abs/2208.06061)
Machine translation has seen rapid progress with the advent of Transformer-based models. These models have no explicit linguistic structure built into them, yet they may still implicitly learn structured relationships by attending to relevant tokens. We hypothesize that this structural learning could be made more robust by explicitly endowing Transformers with a structural bias, and we investigate two methods for building in such a bias. One method, the TP-Transformer, augments the traditional Transformer architecture to include an additional component to represent structure. The second method imbues structure at the data level by segmenting the data with morphological tokenization. We test these methods on translating from English into morphologically rich languages, Turkish and Inuktitut, and consider both automatic metrics and human evaluations. We find that each of these two approaches allows the network to achieve better performance, but this improvement is dependent on the size of the dataset. In sum, structural encoding methods make Transformers more sample-efficient, enabling them to perform better from smaller amounts of data.
[[2208.06031] Handling big tabular data of ICT supply chains: a multi-task, machine-interpretable approach](http://arxiv.org/abs/2208.06031)
Due to the characteristics of Information and Communications Technology (ICT) products, the critical information of ICT devices is often summarized in big tabular data shared across supply chains. Therefore, it is critical to automatically interpret tabular structures with the surging amount of electronic assets. To transform the tabular data in electronic documents into a machine-interpretable format and provide layout and semantic information for information extraction and interpretation, we define a Table Structure Recognition (TSR) task and a Table Cell Type Classification (CTC) task. We use a graph to represent complex table structures for the TSR task. Meanwhile, table cells are categorized into three groups based on their functional roles for the CTC task, namely Header, Attribute, and Data. Subsequently, we propose a multi-task model to solve the defined two tasks simultaneously by using the text modal and image modal features. Our experimental results show that our proposed method can outperform state-of-the-art methods on ICDAR2013 and UNLV datasets.
[[2208.06143] PRIF: Primary Ray-based Implicit Function](http://arxiv.org/abs/2208.06143)
We introduce a new implicit shape representation called Primary Ray-based Implicit Function (PRIF). In contrast to most existing approaches based on the signed distance function (SDF) which handles spatial locations, our representation operates on oriented rays. Specifically, PRIF is formulated to directly produce the surface hit point of a given input ray, without the expensive sphere-tracing operations, hence enabling efficient shape extraction and differentiable rendering. We demonstrate that neural networks trained to encode PRIF achieve successes in various tasks including single shape representation, category-wise shape generation, shape completion from sparse or noisy observations, inverse rendering for camera pose estimation, and neural rendering with color.
[[2208.06179] Exploiting Feature Diversity for Make-up Temporal Video Grounding](http://arxiv.org/abs/2208.06179)
This technical report presents the third-place solution for MTVG, a new task introduced in the 4th Person in Context (PIC) Challenge at ACM MM 2022. MTVG aims at localizing the temporal boundary of a step in an untrimmed video based on a textual description. The biggest challenge of this task is the fine-grained video-text semantics of make-up steps. However, current methods mainly extract video features using action-based pre-trained models. As actions are more coarse-grained than make-up steps, action-based features are not sufficient to provide fine-grained cues. To address this issue, we propose to achieve fine-grained representation by exploiting feature diversity. Specifically, we propose a series of methods spanning feature extraction, network optimization, and model ensembling. As a result, we achieved 3rd place in the MTVG competition.
[[2208.06283] Semantic decomposition Network with Contrastive and Structural Constraints for Dental Plaque Segmentation](http://arxiv.org/abs/2208.06283)
Segmenting dental plaque from images of medical reagent staining provides valuable information for diagnosis and the determination of follow-up treatment plans. However, accurate dental plaque segmentation is a challenging task that requires identifying teeth and dental plaque subject to semantic-blur regions (i.e., confused boundaries in border regions between teeth and dental plaque) and complex variations of instance shapes, which are not fully addressed by existing methods. Therefore, we propose a semantic decomposition network (SDNet) that introduces two single-task branches to separately address the segmentation of teeth and dental plaque and designs additional constraints to learn category-specific features for each branch, thus facilitating the semantic decomposition and improving the performance of dental plaque segmentation. Specifically, SDNet learns two separate segmentation branches for teeth and dental plaque in a divide-and-conquer manner to decouple the entangled relation between them. Each branch, being dedicated to a single category, tends to yield accurate segmentation. To help these two branches better focus on category-specific features, two constraint modules are further proposed: 1) a contrastive constraint module (CCM) to learn discriminative feature representations by maximizing the distance between different category representations, so as to reduce the negative impact of semantic-blur regions on feature extraction; 2) a structural constraint module (SCM) to provide complete structural information for dental plaque of various shapes through the supervision of a boundary-aware geometric constraint. In addition, we construct a large-scale open-source Stained Dental Plaque Segmentation dataset (SDPSeg), which provides high-quality annotations for teeth and dental plaque. Experimental results on the SDPSeg dataset show that SDNet achieves state-of-the-art performance.
[[2208.06040] Figure Descriptive Text Extraction using Ontological Representation](http://arxiv.org/abs/2208.06040)
Experimental research publications provide figure-form resources, including graphs, charts, and other types of images, to effectively support and convey methods and results. To describe figures, authors add captions, which are often incomplete, and more descriptions reside in the body text. This work presents a method to extract figure descriptive text from the body of scientific articles. We adopted ontological semantics to aid concept recognition of figure-related information, which generates human- and machine-readable knowledge representations from sentences. Our results show that conceptual models bring an improvement in figure descriptive sentence classification over word-based approaches.
[[2208.06095] A Fast Blockchain-based Federated Learning Framework with Compressed Communications](http://arxiv.org/abs/2208.06095)
Recently, blockchain-based federated learning (BFL) has attracted intensive research attention because the training process is auditable and the architecture is serverless, avoiding the single point of failure of the parameter server in vanilla federated learning (VFL). Nevertheless, BFL tremendously escalates the communication traffic volume because all local model updates (i.e., changes of model parameters) obtained by BFL clients will be transmitted to all miners for verification and to all clients for aggregation. In contrast, the parameter server and clients in VFL only retain aggregated model updates. Consequently, the huge communication traffic in BFL will inevitably impair the training efficiency and hinder the deployment of BFL in reality. To improve the practicality of BFL, we are among the first to propose a fast blockchain-based communication-efficient federated learning framework by compressing communications in BFL, called BCFL. Meanwhile, we derive the convergence rate of BCFL with non-convex loss. To maximize the final model accuracy, we further formulate the problem to minimize the training loss of the convergence rate subject to a limited training time with respect to the compression rate and the block generation rate, which is a bi-convex optimization problem and can be efficiently solved. Finally, to demonstrate the efficiency of BCFL, we carry out extensive experiments with standard CIFAR-10 and FEMNIST datasets. Our experimental results not only verify the correctness of our analysis, but also show that BCFL can remarkably reduce the communication traffic by 95-98% or shorten the training time by 90-95% compared with BFL.
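The abstract does not say which compression operator BCFL uses. As a rough illustration of what "compressing communications" with a given compression rate can look like, the sketch below assumes top-k sparsification of a model update, a common choice in communication-efficient federated learning. The function names `top_k_compress` and `decompress` are hypothetical and not taken from the paper.

```python
import math


def top_k_compress(update, rate):
    """Keep only the largest-magnitude entries of a model update.

    ASSUMPTION: top-k sparsification stands in for the (unspecified)
    compressor; `rate` is the fraction of entries that are transmitted.
    Returns a sparse {index: value} mapping.
    """
    k = max(1, math.ceil(rate * len(update)))
    # Rank indices by absolute value, largest first, and keep the first k.
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return {i: update[i] for i in sorted(ranked[:k])}


def decompress(sparse, length):
    """Rebuild a dense update, filling dropped entries with zeros."""
    dense = [0.0] * length
    for i, value in sparse.items():
        dense[i] = value
    return dense


if __name__ == "__main__":
    local_update = [0.1, -2.0, 0.05, 1.5]
    sparse = top_k_compress(local_update, rate=0.5)
    print(sparse)
    print(decompress(sparse, len(local_update)))
```

With a rate of 0.5, only half of the entries (plus their indices) would be broadcast to miners and clients, which is the kind of traffic reduction the abstract trades off against convergence.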
[[2208.06192] Personalizing or Not: Dynamically Personalized Federated Learning with Incentives](http://arxiv.org/abs/2208.06192)
Personalized federated learning (FL) facilitates collaborations between multiple clients to learn personalized models without sharing private data. The mechanism mitigates the statistical heterogeneity commonly encountered in the system, i.e., non-IID data over different clients. Existing personalized algorithms generally assume all clients volunteer for personalization. However, potential participants might still be reluctant to personalize models since they might not work well. In this case, clients choose to use the global model instead. To avoid making unrealistic assumptions, we introduce the personalization rate, measured as the fraction of clients willing to train personalized models, into federated settings and propose DyPFL. This dynamically personalized FL technique incentivizes clients to participate in personalizing local models while allowing the adoption of the global model when it performs better. We show that the algorithmic pipeline in DyPFL guarantees good convergence performance, allowing it to outperform alternative personalized methods in a broad range of conditions, including variation in heterogeneity, number of clients, local epochs, and batch sizes.
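To make the personalization rate concrete, here is a minimal sketch that computes it as the fraction of willing clients and lets each client fall back to the global model when that model performs better. The `Client` fields and the loss-based selection rule are assumptions made for illustration; the abstract does not describe DyPFL's actual incentive mechanism or selection criterion.

```python
from dataclasses import dataclass


@dataclass
class Client:
    # All fields are hypothetical; they are not taken from the paper.
    name: str
    willing: bool         # does the client volunteer for personalization?
    global_loss: float    # validation loss of the shared global model
    personal_loss: float  # validation loss of the client's personalized model


def personalization_rate(clients):
    """Fraction of clients willing to train personalized models."""
    return sum(c.willing for c in clients) / len(clients)


def choose_model(client):
    """ASSUMED rule: use the personalized model only if the client is
    willing and it beats the global model on validation loss."""
    if client.willing and client.personal_loss < client.global_loss:
        return "personalized"
    return "global"


if __name__ == "__main__":
    clients = [
        Client("a", True, 0.9, 0.6),
        Client("b", True, 0.5, 0.7),
        Client("c", False, 0.8, 0.4),
        Client("d", False, 0.6, 0.6),
    ]
    print(personalization_rate(clients))
    print([choose_model(c) for c in clients])
```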
[[2208.06308] Developing a Philosophical Framework for Fair Machine Learning: The Case of Algorithmic Collusion and Market Fairness](http://arxiv.org/abs/2208.06308)
Fair machine learning research has been primarily concerned with classification tasks that result in discrimination. As machine learning algorithms are applied in new contexts, however, the harms or injustices that result are qualitatively different than those presently studied. Existing research at the level of metrics and definitions cannot measure these qualitatively different types of injustice. One example of this is the problem of market fairness and algorithmic collusion. Negative consequences of algorithmic collusion affect all consumers, not only particular members of a protected class. Drawing on this case study, I develop an ethical framework for fair machine learning research in new domains. This contribution ties the development of fairness metrics to specifically scoped normative principles. This enables fairness metrics to reflect different concerns from discrimination. I develop this framework and provide the philosophical rationale for its structure, ultimately applying it to the case of algorithmic collusion. I conclude with limitations of my proposal and discuss promising avenues of future research.
[[2208.06140] Style Spectroscope: Improve Interpretability and Controllability through Fourier Analysis](http://arxiv.org/abs/2208.06140)
Universal style transfer (UST) infuses styles from arbitrary reference images into content images. Existing methods, while enjoying many practical successes, are unable to explain experimental observations, including different performances of UST algorithms in preserving the spatial structure of content images. In addition, methods are limited to cumbersome global controls on stylization, so that they require additional spatial masks for desired stylization. In this work, we provide a systematic Fourier analysis on a general framework for UST. We present an equivalent form of the framework in the frequency domain. The form implies that existing algorithms treat all frequency components and pixels of feature maps equally, except for the zero-frequency component. We connect Fourier amplitude and phase with Gram matrices and a content reconstruction loss in style transfer, respectively. Based on such equivalence and connections, we can thus interpret different structure preservation behaviors between algorithms with Fourier phase. Given the interpretations we have, we propose two manipulations in practice for structure preservation and desired stylization. Both qualitative and quantitative experiments demonstrate the competitive performance of our method against the state-of-the-art methods. We also conduct experiments to demonstrate (1) the above-mentioned equivalence, (2) the interpretability based on Fourier amplitude and phase and (3) the controllability associated with frequency components.
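The amplitude/phase split that the abstract relies on can be illustrated on a 1D signal. The sketch below is not the paper's method: it uses a naive discrete Fourier transform on a plain list rather than 2D feature maps, and the helper names (`dft`, `idft`, `amplitude_phase`, `recombine`) are hypothetical. It only shows that a signal is fully determined by its amplitude and phase spectra, which is what makes it possible to manipulate the two separately.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real or complex list."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of `dft`."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def amplitude_phase(x):
    """Split the spectrum of x into amplitude and phase lists."""
    spectrum = dft(x)
    return [abs(c) for c in spectrum], [cmath.phase(c) for c in spectrum]


def recombine(amplitude, phase):
    """Rebuild a signal from amplitude and phase; only the real part is kept."""
    spectrum = [cmath.rect(a, p) for a, p in zip(amplitude, phase)]
    return [v.real for v in idft(spectrum)]


if __name__ == "__main__":
    content = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0]
    amp, ph = amplitude_phase(content)
    # amp[0] is the zero-frequency component, i.e. |sum(content)|.
    print(amp[0])
    print(recombine(amp, ph))  # reproduces `content` up to rounding error
```

Passing the amplitude of one signal together with the phase of another to `recombine` gives a toy version of keeping one input's structure (phase) while borrowing another's spectral energy (amplitude).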