Full Record

New Search | Similar Records

Title Applications of Integer Programming Methods to Solve Statistical Problems
Publication Date
Discipline/Department Statistics
University/Publisher University of California – Berkeley
Abstract Many problems in statistics are inherently discrete. When one of these problems also contains an optimization component, integer programming may be used to facilitate a solution to the statistical problem. We use integer programming techniques to help solve problems in the following areas: optimal blocking of a randomized controlled experiment with several treatment categories and statistical auditing using stratified random samples.We develop a new method for blocking in randomized experiments that works for an arbitrary number of treatments. We analyze the following problem: given a threshold for the minimum number of units to be contained in a block, and given a distance measure between any two units in the finite population, block the units so that the maximum distance between any two units within a block is minimized. This blocking criterion can minimize covariate imbalance, which is a common goal in experimental design. Finding an optimal blocking is an NP-hard problem. However, using ideas from graph theory, we provide the first polynomial time approximately optimal blocking algorithm for when there are more than two treatment categories. In the case of just two such categories, our approach is more efficient than existing methods. We derive the variances of estimators for sample average treatment effects under the Neyman-Rubin potential outcomes model for arbitrary blocking assignments and an arbitrary number of treatments. In addition, statistical election audits can be used to collect evidence that the set of winners (the outcome) of an election according to the machine count is correct – that it agrees with the outcome that a full hand count of the audit trail would show. The strength of evidence is measured by the <italic>p<\italic>-value of the hypothesis that the machine outcome is wrong. Smaller <italic>p<\italic>-values are stronger evidence that the outcome is correct. Most states that have election audits of any kind require audit samples stratified by county for contests that cross county lines. Previous work on <italic>p<\italic>-values for stratified samples based on the largest weighted overstatement of the margin used upper bounds that can be quite weak. Sharper $p$-values than those found by previous work can be found by solving a 0-1 knapsack problem. We also give algorithms for choosing how many batches to draw from each stratum to reduce the counting burden.
Subjects/Keywords Statistics; Blocking; Causal inference; Election auditing; Exact inference; Experimental design; Integer programming
Language en
Rights public
Country of Publication us
Format application/pdf
Record ID california:qt1g55c77q
Other Identifiers qt1g55c77q
Repository california
Date Indexed 2018-02-26

Sample Search Hits | Sample Images

…larger than the one observed under an exact null hypothesis. Frequently in causal inference, subjects in one treatment group are matched to one or more subjects in another treatment group based on the units’ propensity scores or observed values of…

…between units. Note that, for these twelve observations, the approximate algorithm yields the desired optimal blocking in figure 2.2. In general, this will not be the case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Exact and…

exact p-values (P# ). Both are substantially smaller than bounds using the method in Stark (2008b). . 38 Number of batches to audit and expected number of ballots to audit to get p-values no larger than 0.05 for 2006 Minnesota Senate…

…46 vi List of Tables 3.1 Conservative and exact p-values for the hypothesis that the apparent outcome of the 2006 U.S. Senate race in Minnesota was wrong, based on Minnesota’s audit of a stratified random sample of 202 precincts. Values are given…

…for two test statistics: maximum MRO and maximum taint. Column 2: conservative p-value using the method of Stark (2008b). Column 3: LKP conservative p-value. Column 4: exact p-value obtained by solving KP…

…37 3.2 Statutory, PSS, first.r, and next.r sample sizes for the 2006 Minnesota Senate contest. Number of batches to audit and expected number of ballots to audit to obtain p-values no larger than the exact p-values in Table 3.1 (0.0159 for…

…efficiently assign a finite number of units to pre-selected treatment categories. Nonparametric inference problems may involve computing ranks and counting the number of permutations of treatment assignments that would produce a test statistic as large or…

…The core of the paper presents an exact substantive test of details when units are selected using a simple random sample. A substantive test of details is a type of audit that tests whether the aggregate discrepancy in a set of audited units is…