EPFL CIS-RIKEN AIP Joint Seminar #6 20211215
Date and Time: December 15th 6:00pm – 7:00pm (JST)

Title: Selection bias may be adjusted when the sample size is negative in hierarchical clustering, phylogeny, and variable selection

For computing p-values, you should specify hypotheses before looking at data. However, people tend to use datasets twice, for hypothesis selection and evaluation, leading to inflated statistical significance and more false positives than expected. Recently, a new statistical method, called selective inference or post-selection inference, has been developed for adjusting this selection bias. On the other hand, we also face biased p-values in multiple testing, although it is a different type of selection bias. In this talk, I present a bootstrap resampling method with a "negative sample size" for adjusting these two types of selection bias. The theory is based on a geometric idea in the data space, which bridges the Bayesian posterior probability to the frequentist p-value. Examples are shown for the confidence interval of regression coefficients after model selection and for significance levels of trees and edges in hierarchical clustering and phylogenetic inference.

Bio: Hidetoshi Shimodaira is a professor at Kyoto University and a team leader at RIKEN AIP. He has been working on the theory and methods of statistics and machine learning. His multiscale bootstrap method is used in genomics for evaluating the statistical significance of trees and clusters. His "covariate shift" setting for transfer learning is popular in machine learning.

15:00-17:00 Online Streaming via Zoom Webinar (registration required)
This seminar introduces the research of the Mathematical Statistics Team ( ) at RIKEN AIP.

Speaker 1: Hidetoshi Shimodaira (30 mins)
Title: Statistical Intelligence for Advanced Artificial Intelligence
Abstract: Our goal is to develop a data-driven methodology with statistical inference for artificial intelligence, which may be called "statistical intelligence." In the first half of the talk, I overview our research topics: (1) representation learning via graph embedding for multimodal relational data, (2) valid inference via bootstrap resampling for many hypotheses with selection bias, and (3) statistical estimation of growth mechanisms from complex networks. In the second half of the talk, I discuss a generalization of the "additive compositionality" of word embeddings in natural language processing. I show the computation of distributed representations for logical operations, including AND, OR, and NOT, which would be a basis for implementing "advanced thinking" by AI in the future.

Title: Approximation Capability of Graph Embedding using Siamese Neural Network
Abstract: In this talk, we present our studies on the approximation capability of graph embedding using the Siamese neural network (NN). Whereas a prevailing line of previous works applies the inner-product similarity (IPS) to the neural network outputs, the resulting Siamese NN can approximate only positive-definite similarities. To overcome this limitation, we propose novel similarities, called the shifted inner product similarity (SIPS) and the weighted inner product similarity (WIPS), for the Siamese NN. We theoretically prove and empirically demonstrate their improved approximation capabilities.

Title: Selective inference via multiscale bootstrap and its application
Abstract: We consider a general approach to selective inference for hypothesis testing of a null hypothesis represented as an arbitrarily shaped region in the parameter space of the multivariate normal model. This approach is useful for hierarchical clustering, where confidence levels of clusters are calculated only for those appearing in the dendrogram, which are subject to heavy selection bias. Our computation is based on a raw confidence measure, called the bootstrap probability, which is easily obtained by counting how many times the same cluster appears in bootstrap replicates of the dendrogram. We adjust the bias of the bootstrap probability by utilizing a scaling law expressed in terms of geometric quantities of the region in the abstract parameter space, namely, the signed distance and the mean curvature.
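The raw bootstrap probability described in the selective-inference abstract, obtained by counting how often a cluster of interest reappears in bootstrap replicates of the dendrogram, can be sketched as follows. This is a minimal illustration with made-up data; the clustering settings (average linkage, Euclidean distance) and helper names are my own choices, not the speaker's code:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Toy data: 10 items x 50 features; items 0-4 share a common shift,
# so they should form a cluster in the dendrogram.
X = rng.normal(size=(10, 50))
X[:5] += 2.0

def clusters_of(data):
    """All clusters (as frozensets of item indices) appearing in the dendrogram."""
    Z = linkage(data, method="average")
    found = set()
    for k in range(2, data.shape[0]):
        labels = fcluster(Z, t=k, criterion="maxclust")
        for c in np.unique(labels):
            found.add(frozenset(np.flatnonzero(labels == c)))
    return found

target = frozenset(range(5))  # the cluster whose confidence we want

# Resample the features (columns) with replacement, rebuild the dendrogram,
# and count how often the same cluster reappears.
B = 200
hits = 0
n_feat = X.shape[1]
for _ in range(B):
    cols = rng.integers(0, n_feat, size=n_feat)
    if target in clusters_of(X[:, cols]):
        hits += 1

bp = hits / B  # raw bootstrap probability of the target cluster
print(f"bootstrap probability: {bp:.2f}")
```

As the abstract notes, this raw frequency is subject to selection bias, since confidence is computed only for clusters that were already selected by appearing in the observed dendrogram.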
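The bias adjustment via the scaling law can be sketched in the style of pvclust/scaleboot-type implementations of the multiscale bootstrap: the bootstrap probability is computed at several sample sizes n', the normalized z-value is fitted linearly in the scale sigma2 = n/n', and the fit is extrapolated to sigma2 = -1, i.e., the "negative sample size" n' = -n. The bootstrap probabilities below are made-up numbers for illustration, and the notation (v for signed distance, c for mean curvature) follows that presentation rather than the talk itself:

```python
import numpy as np
from scipy.stats import norm

# Made-up multiscale bootstrap probabilities bp observed at scales
# sigma2 = n/n', where n' is the bootstrap sample size.
sigma2 = np.array([0.5, 0.7, 1.0, 1.4, 2.0])
bp     = np.array([0.88, 0.85, 0.80, 0.75, 0.70])

# Scaling law: psi(sigma2) = sigma * Phi^{-1}(1 - bp) ~ v + c * sigma2,
# where v plays the role of signed distance and c of mean curvature.
psi = np.sqrt(sigma2) * norm.ppf(1.0 - bp)
c, v = np.polyfit(sigma2, psi, 1)  # slope c, intercept v

# Extrapolating to sigma2 = -1 (n' = -n) gives the approximately
# unbiased p-value that adjusts the selection bias of the raw
# bootstrap probability (which corresponds to sigma2 = +1).
p_au = 1.0 - norm.cdf(v - c)
print(f"v={v:.3f}, c={c:.3f}, adjusted p-value={p_au:.3f}")
```

The geometric picture is that v measures how far the data lie from the hypothesis region while c measures the curvature of its boundary; flipping the sign of the scale trades the Bayesian-flavored bootstrap probability for a frequentist p-value.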
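The SIPS and WIPS similarities mentioned in the Siamese-NN abstract extend the inner-product similarity so that similarities beyond the positive-definite class become approximable. The following schematic sketch contrasts the three similarity models; the one-layer embedding map and all parameters here are stand-ins I made up for illustration, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_emb = 8, 4

W   = rng.normal(size=(d_in, d_emb))  # stand-in for a trained embedding network
u_w = rng.normal(size=(d_in,))        # parameters of a scalar bias term
lam = rng.normal(size=(d_emb,))       # element-wise weights (may be negative)

def f(x):
    """Stand-in embedding network: a single tanh layer."""
    return np.tanh(x @ W)

def ips(x, y):
    # inner-product similarity: restricted to positive-definite similarities
    return float(f(x) @ f(y))

def sips(x, y):
    # shifted IPS: IPS plus a learned scalar bias term for each input
    u = lambda z: float(np.tanh(z @ u_w))
    return ips(x, y) + u(x) + u(y)

def wips(x, y):
    # weighted IPS: signed element-wise weights, so the learned
    # similarity need not be positive definite
    return float(f(x) @ (lam * f(y)))

a, b = rng.normal(size=d_in), rng.normal(size=d_in)
print(ips(a, b), sips(a, b), wips(a, b))
```

All three models are symmetric in their two arguments, as a similarity should be; the extra freedom in SIPS and WIPS is what the abstract's approximation-capability results concern.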