We consider risk minimization problems where the (source) distribution $P_S$ of the training observations $Z_1, \ldots, Z_n$ differs from the (target) distribution $P_T$ involved in the risk that one seeks to minimize. Under the natural assumption that $P_S$ dominates $P_T$, \textit{i.e.} $P_T< \! \! 

APA: Bertail, P., Clémençon, S., Guyonvarch, Y. & Noiry, N. (2021). Learning from Biased Data: A Semi-Parametric Approach. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:803-812. Available from https://proceedings.mlr.press/v139/bertail21a.html.

Endnote:
%0 Conference Paper
%T Learning from Biased Data: A Semi-Parametric Approach
%A Patrice Bertail
%A Stephan Clémençon
%A Yannick Guyonvarch
%A Nathan Noiry
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-bertail21a
%I PMLR
%P 803--812
%U https://proceedings.mlr.press/v139/bertail21a.html
%V 139

BibTeX:
@InProceedings{pmlr-v139-bertail21a,
  title = {Learning from Biased Data: A Semi-Parametric Approach},
  author = {Bertail, Patrice and Cl{\'e}men{\c{c}}on, Stephan and Guyonvarch, Yannick and Noiry, Nathan},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages = {803--812},
  year = {2021},
  editor = {Meila, Marina and Zhang, Tong},
  volume = {139},
  series = {Proceedings of Machine Learning Research},
  month = {18--24 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v139/bertail21a/bertail21a.pdf},
  url = {https://proceedings.mlr.press/v139/bertail21a.html}
}
Despite the high performance and reliability of deep learning algorithms in a wide range of everyday applications, many investigations show that numerous models exhibit biases, discriminating against specific subgroups of the population (e.g. by gender or ethnicity). This urges practitioners to develop fair systems with uniform or comparable performance across sensitive groups. In this work, we investigate the gender bias of deep Face Recognition networks. In order to measure this bias, we introduce two new metrics, BFAR and BFRR, that better reflect the inherent deployment needs of Face Recognition systems. Motivated by geometric considerations, we mitigate gender bias through a new post-processing methodology which transforms the deep embeddings of a pre-trained model to give more representation power to discriminated subgroups. It consists of training a shallow neural network by minimizing a Fair von Mises-Fisher loss whose hyperparameters account for the intra-class variance of each gender. Interestingly, we empirically observe that these hyperparameters are correlated with our fairness metrics. In fact, extensive numerical experiments on a variety of datasets show that a careful selection of these hyperparameters significantly reduces gender bias.
Motivated by sequential budgeted allocation problems, we investigate online matching problems where connections between vertices are not i.i.d., but have fixed degree distributions: the so-called configuration model. We estimate the competitive ratio of the simplest algorithm, GREEDY, by approximating some relevant stochastic discrete processes by their continuous counterparts, which are solutions of an explicit system of partial differential equations. This technique gives precise bounds on the estimation errors, with arbitrarily high probability as the problem size increases. In particular, it allows the formal comparison between different configuration models. We also prove that, quite surprisingly, GREEDY can have better performance guarantees than RANKING, another celebrated algorithm for online matching that usually outperforms the former.
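To make the setting of the abstract above concrete, here is a minimal sketch of GREEDY run on a bipartite graph drawn from a configuration model. It is an illustration under stated assumptions, not the authors' code: the function names, the fixed degree sequences, and the uniform tie-breaking among free neighbors are all hypothetical choices that the abstract does not specify.

```python
import random


def bipartite_configuration_model(offline_degrees, online_degrees, rng):
    """Pair half-edges of the two sides uniformly at random.

    Assumption: degrees are given as fixed sequences; multi-edges may occur.
    Returns a dict mapping each online vertex to its list of offline neighbors.
    """
    offline_stubs = [u for u, deg in enumerate(offline_degrees) for _ in range(deg)]
    online_stubs = [v for v, deg in enumerate(online_degrees) for _ in range(deg)]
    if len(offline_stubs) != len(online_stubs):
        raise ValueError("both sides must have the same total degree")
    rng.shuffle(offline_stubs)
    neighbors = {v: [] for v in range(len(online_degrees))}
    for v, u in zip(online_stubs, offline_stubs):
        neighbors[v].append(u)
    return neighbors


def greedy_online_matching(neighbors, arrival_order, rng):
    """Match each arriving online vertex to a free neighbor, if one exists.

    Assumption: ties are broken uniformly at random among free neighbors.
    Decisions are irrevocable, as in the online setting.
    """
    matched_offline = set()
    matching = {}
    for v in arrival_order:
        free = [u for u in neighbors.get(v, []) if u not in matched_offline]
        if free:
            u = rng.choice(free)
            matched_offline.add(u)
            matching[v] = u
    return matching


rng = random.Random(0)
offline_degrees = [2, 1, 1, 2]  # hypothetical degree sequences
online_degrees = [1, 2, 2, 1]
neighbors = bipartite_configuration_model(offline_degrees, online_degrees, rng)
matching = greedy_online_matching(neighbors, range(len(online_degrees)), rng)
print(len(matching), "of", len(online_degrees), "online vertices matched")
```

Dividing the size of the returned matching by the size of a maximum matching of the same graph gives the empirical ratio whose large-size behavior the abstract studies. On the path-shaped instance with online neighborhoods {x}, {x, y}, {y}, an arrival order that starts with the middle vertex makes GREEDY match two vertices while the optimum matches three.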
We study representation learning for Offline Reinforcement Learning (RL), focusing on the important task of Offline Policy Evaluation (OPE). Recent work shows that, in contrast to supervised learning, realizability of the Q-function is not enough for learning it. Two sufficient conditions for sample-efficient OPE are Bellman completeness and coverage. Prior work often assumes that representations satisfying these conditions are given, with results being mostly theoretical in nature. In this work, we propose BCRL, which directly learns from data an approximately linear Bellman complete representation with good coverage. With this learned representation, we perform OPE using Least Square Policy Evaluation (LSPE) with linear functions in our learned representation. We present an end-to-end theoretical analysis, showing that our two-stage algorithm enjoys polynomial sample complexity provided some representation in the rich class considered is linear Bellman complete. Empirically, we extensively evaluate our algorithm on challenging, image-based continuous control tasks from the Deepmind Control Suite. We show that our representation enables better OPE than previous representation learning methods developed for off-policy RL (e.g., CURL, SPR). BCRL achieves OPE error competitive with the state-of-the-art method Fitted Q-Evaluation (FQE), and beats FQE when evaluating beyond the initial state distribution. Our ablations show that both the linear Bellman completeness and coverage components of our method are crucial.
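The second stage mentioned in the abstract above, least-squares policy evaluation with linear functions, can be sketched as repeated regression onto bootstrapped targets. This is a generic illustration, not the BCRL implementation: the features are assumed to be given (the representation-learning stage is not reproduced), and the names `phi`, `phi_next`, `lspe`, and the small `ridge` term are hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b for a small dense system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def lspe(phi, phi_next, rewards, gamma, n_iters=200, ridge=1e-8):
    """Iterate w <- argmin_w sum_i (phi_i . w - (r_i + gamma * phi_next_i . w_old))^2.

    phi[i]      : features of the logged pair (s_i, a_i)
    phi_next[i] : features of (s'_i, a') with a' chosen by the evaluated policy
    ridge       : small regularizer added for numerical safety (an assumption)
    """
    d = len(phi[0])
    # The Gram matrix does not depend on w, so it is built once.
    gram = [[sum(p[j] * p[k] for p in phi) + (ridge if j == k else 0.0)
             for k in range(d)] for j in range(d)]
    w = [0.0] * d
    for _ in range(n_iters):
        targets = [r + gamma * sum(wj * fj for wj, fj in zip(w, f))
                   for r, f in zip(rewards, phi_next)]
        rhs = [sum(p[j] * y for p, y in zip(phi, targets)) for j in range(d)]
        w = solve_linear_system(gram, rhs)
    return w


# Toy check: two states that alternate, one-hot features, reward 1 when leaving state 0.
phi = [[1.0, 0.0], [0.0, 1.0]]
phi_next = [[0.0, 1.0], [1.0, 0.0]]
rewards = [1.0, 0.0]
w = lspe(phi, phi_next, rewards, gamma=0.5)
print(w)  # close to the exact values 4/3 and 2/3
```

In the toy chain the values satisfy V0 = 1 + 0.5 V1 and V1 = 0.5 V0, so the iteration should approach (4/3, 2/3). With one-hot features the linear class is trivially Bellman complete; with general features that property is exactly what the abstract says must be learned.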
Off-policy evaluation and learning (OPE/L) use offline observational data to make better decisions, which is crucial in applications where online experimentation is limited. However, depending entirely on logged data, OPE/L is sensitive to environment distribution shifts, that is, discrepancies between the data-generating environment and the one where policies are deployed. Si et al. (2020) proposed distributionally robust OPE/L (DROPE/L) to address this, but the proposal relies on inverse-propensity weighting, whose estimation error and regret will deteriorate if propensities are nonparametrically estimated and whose variance is suboptimal even if not. For standard, non-robust, OPE/L, this is solved by doubly robust (DR) methods, but they do not naturally extend to the more complex DROPE/L, which involves a worst-case expectation. In this paper, we propose the first DR algorithms for DROPE/L with KL-divergence uncertainty sets. For evaluation, we propose Localized Doubly Robust DROPE (LDR$^2$OPE) and show that it achieves semiparametric efficiency under weak product rate conditions. Thanks to a localization technique, LDR$^2$OPE only requires fitting a small number of regressions, just like DR methods for standard OPE. For learning, we propose Continuum Doubly Robust DROPL (CDR$^2$OPL) and show that, under a product rate condition involving a continuum of regressions, it enjoys a fast regret rate of $O(N^{-1/2})$ even when unknown propensities are nonparametrically estimated. We empirically validate our algorithms in simulations and further extend our results to general $f$-divergence uncertainty sets.
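The worst-case expectation over a KL-divergence ball that the abstract above refers to has a well-known one-dimensional dual: the infimum of E_Q[R] over Q with KL(Q || P) <= delta equals the supremum over alpha > 0 of -alpha * log E_P[exp(-R / alpha)] - alpha * delta. The sketch below evaluates that dual on an empirical sample with a plain grid search. It only illustrates the robust objective: it contains no propensity weighting and none of the doubly robust corrections proposed in the paper, and the function names and the grid are hypothetical.

```python
import math


def kl_dual_objective(rewards, alpha, delta):
    """Dual objective -alpha * log(mean(exp(-r / alpha))) - alpha * delta.

    The minimum reward is factored out so that every exponent is <= 0,
    which avoids overflow for small alpha.
    """
    m = min(rewards)
    mean_exp = sum(math.exp(-(r - m) / alpha) for r in rewards) / len(rewards)
    return m - alpha * math.log(mean_exp) - alpha * delta


def worst_case_mean_kl(rewards, delta, alphas=None):
    """Approximate the worst-case mean over a KL ball of radius delta.

    Each alpha yields a lower bound on the worst-case mean, so the grid
    maximum is a slightly conservative approximation of the supremum.
    """
    if alphas is None:
        alphas = [10 ** (k / 10) for k in range(-40, 41)]  # 1e-4 up to 1e4
    return max(kl_dual_objective(rewards, a, delta) for a in alphas)


rewards = [0.0, 1.0, 2.0, 3.0]  # hypothetical sample of rewards
for delta in (0.0, 0.1, 1.0):
    print(delta, round(worst_case_mean_kl(rewards, delta), 4))
```

With delta = 0 the value is close to the sample mean, and it decreases toward the smallest reward as delta grows, which is the pessimism that the robust formulation encodes.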
Focusing on diagonal linear networks as a model for understanding the implicit bias in underdetermined models, we show how the gradient descent step size can have a large qualitative effect on the implicit bias, and thus on generalization ability. In particular, we show how using a large step size for non-centered data can change the implicit bias from a "kernel" type behavior to a "rich" (sparsity-inducing) regime, even when gradient flow, studied in previous works, would not escape the "kernel" regime. We do so by using dynamic stability, proving that convergence to dynamically stable global minima entails a bound on some weighted $\ell_1$-norm of the linear predictor, i.e. a "rich" regime. We prove this leads to good generalization in a sparse regression setting.
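For readers unfamiliar with the model in the abstract above, the following sketch runs plain gradient descent on a diagonal linear network for an underdetermined least-squares problem. The parametrization beta = u*u - v*v is a common choice in the literature and is an assumption here, as are the function names, the data, and the hyperparameters. The run uses a small initialization and a small step size, so it illustrates a sparse ("rich") solution in the simplest way; it does not reproduce the large-step-size, non-centered-data effect that the paper analyzes.

```python
def squared_loss(X, y, beta):
    """Mean squared error with the conventional 1/2 factor."""
    residuals = [sum(xj * bj for xj, bj in zip(x, beta)) - yi for x, yi in zip(X, y)]
    return sum(r * r for r in residuals) / (2 * len(X))


def train_diagonal_network(X, y, step_size, init_scale, n_steps):
    """Gradient descent on (u, v) with linear predictor beta = u*u - v*v (elementwise)."""
    n, d = len(X), len(X[0])
    u = [init_scale] * d
    v = [init_scale] * d  # equal initialization, so beta starts at zero
    for _ in range(n_steps):
        beta = [ui * ui - vi * vi for ui, vi in zip(u, v)]
        residuals = [sum(xj * bj for xj, bj in zip(x, beta)) - yi
                     for x, yi in zip(X, y)]
        # Gradient of the loss with respect to beta.
        grad_beta = [sum(r * x[j] for r, x in zip(residuals, X)) / n for j in range(d)]
        # Chain rule: d beta / d u = 2u and d beta / d v = -2v.
        u = [ui - step_size * 2.0 * ui * g for ui, g in zip(u, grad_beta)]
        v = [vi + step_size * 2.0 * vi * g for vi, g in zip(v, grad_beta)]
    return [ui * ui - vi * vi for ui, vi in zip(u, v)]


# One equation, three unknowns: many interpolating solutions exist.
X = [[2.0, 1.0, 1.0]]
y = [2.0]
beta = train_diagonal_network(X, y, step_size=0.01, init_scale=0.01, n_steps=5000)
print([round(b, 3) for b in beta], squared_loss(X, y, beta))
```

The minimum-$\ell_2$-norm interpolator of this toy problem is (2/3, 1/3, 1/3), while the minimum-$\ell_1$-norm one is (1, 0, 0). With the small initialization used here, the returned predictor puts most of its mass on the first coordinate, which is the sparsity-inducing behavior the abstract calls "rich".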
This paper details the results and main findings of the second iteration of the Multi-modal Aerial View Object Classification (MAVOC) challenge. The primary goal of both MAVOC challenges is to inspire research into methods for building recognition models that utilize both synthetic aperture radar (SAR) and electro-optical (EO) imagery. Teams are encouraged to develop multi-modal approaches that incorporate complementary information from both domains. While the 2021 challenge showed a proof of concept that both modalities could be used together, the 2022 challenge focuses on the detailed multi-modal methods. The 2022 challenge uses the same UNIfied COincident Optical and Radar for recognitioN (UNICORN) dataset and competition format that was used in 2021. Specifically, the challenge focuses on two tasks: (1) SAR classification and (2) SAR + EO classification. The bulk of this document is dedicated to discussing the top performing methods and describing their performance on our blind test set. Notably, all of the top ten teams outperform a ResNet-18 baseline. For SAR classification, the top team shows a 129% improvement over the baseline and an 8% average improvement over the 2021 winner. The top team for SAR + EO classification shows a 165% improvement over the baseline and a 32% average improvement over 2021.
Remote photoplethysmography (rPPG), a family of techniques for monitoring blood volume changes, may be especially useful for contactless health monitoring via face videos from consumer-grade cameras. The COVID-19 pandemic caused widespread use of protective face masks, which results in a domain shift from the typical region of interest. In this paper we show that augmenting unmasked face videos by adding patterned synthetic face masks forces the deep learning-based rPPG model to attend to the periocular and forehead regions, improving performance and closing the gap between masked and unmasked pulse estimation. This paper offers several novel contributions: (a) a deep learning-based method designed for remote photoplethysmography in the presence of face masks, (b) a new dataset acquired from 54 masked subjects with recordings of their faces and ground-truth pulse waveforms, (c) a data augmentation method to add a synthetic mask to a face video, and (d) evaluations of handcrafted algorithms and two 3D convolutional neural network-based architectures trained on videos of unmasked faces and of faces with masks synthetically added.