Center for Statistics and Applications in Forensic Evidence
Facility: Ames, United States
Research output, citation impact, and the most-cited recent papers from Center for Statistics and Applications in Forensic Evidence. Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Center for Statistics and Applications in Forensic Evidence
In the past decade, and in response to the recommendations set forth by the National Research Council Committee on Identifying the Needs of the Forensic Sciences Community (2009), scientists have conducted several black-box studies that attempt to estimate the error rates of firearm examiners. Most of these studies have resulted in vanishingly small error rates, and at least one of them (D. P. Baldwin, S. J. Bajic, M. Morris, and D. Zamzow. A Study of False-Positive and False-Negative Error Rates in Cartridge Case Comparisons. Technical report, Ames Lab IA, Performing, Fort Belvoir, VA, April 2014.) was cited by the President’s Council of Advisors on Science and Technology (PCAST) during the Obama administration as an example of a well-designed experiment. What has received little attention, however, is the actual calculation of error rates and, in particular, the effect of inconclusive findings on those error estimates. The treatment of inconclusives in the assessment of errors has far-reaching implications in the legal system. Here, we revisit several black-box studies in the area of firearms examination, investigating their treatment of inconclusive results. It is clear that there are stark differences in the rate of inconclusive results in regions with different norms for training and reporting conclusions. More surprisingly, the rate of inconclusive decisions for materials from different sources is notably higher than the rate of inconclusive decisions for same-source materials in some regions. To mitigate the effects of this difference, we propose a unifying approach to the calculation of error rates that is directly applicable in forensic laboratories and in legal settings.
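To illustrate why the treatment of inconclusives matters, here is a minimal sketch that computes a false-positive rate for different-source comparisons under three commonly discussed conventions. The function name, the convention labels, and the counts are all hypothetical and chosen for illustration; this is not the unifying approach the abstract proposes.

```python
def false_positive_rates(identifications, inconclusives, eliminations):
    """False-positive rate for different-source comparisons under three conventions.

    identifications: different-source pairs wrongly reported as same source
    inconclusives:   different-source pairs reported as inconclusive
    eliminations:    different-source pairs correctly reported as different source
    """
    conclusive = identifications + eliminations
    total = conclusive + inconclusives
    return {
        # Inconclusives are dropped from the denominator entirely.
        "inconclusives_excluded": identifications / conclusive,
        # Inconclusives stay in the denominator but are never counted as errors.
        "inconclusives_as_correct": identifications / total,
        # Inconclusives are counted as errors (no elimination was reached).
        "inconclusives_as_errors": (identifications + inconclusives) / total,
    }


# Hypothetical counts, not taken from any study mentioned above.
rates = false_positive_rates(identifications=2, inconclusives=98, eliminations=100)
for convention, rate in rates.items():
    print(f"{convention}: {rate:.4f}")
```

With the same raw counts, the reported rate ranges from 1% to 50% depending only on the convention, which is why the choice needs to be stated explicitly.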
A blind quality control (QC) program was successfully developed and implemented in the Toxicology, Seized Drugs, Firearms, Latent Prints (Processing and Comparison), Forensic Biology, and Multimedia (Digital and Audio/Video) sections at the Houston Forensic Science Center (HFSC). The program was put into practice based on recommendations set forth in the 2009 National Academy of Sciences report and is conducted in addition to accreditation-required annual proficiency tests. The blind QC program allows HFSC to test its entire quality management system and provides a real-time assessment of the laboratory's proficiency. To ensure the blind QC cases mimicked real casework, the workflow for each forensic discipline and their evidence submission processes were assessed prior to implementation. Samples are created and submitted by the HFSC Quality Division, to whom the expected answer is known. Results from 2015 to 2018 show that of the 973 blind samples submitted, 901 were completed, and only 51 were discovered by analysts as being blind QC cases. Implementation data suggest that this type of program can be employed at other forensic laboratories.
Open proficiency tests meet accreditation requirements and measure examiner competence but may not represent actual casework. In December 2015, the Houston Forensic Science Center began a blind quality control program in firearms examination. Mock cases are created to mimic routine casework so that examiners are unaware they are being tested. Once the blind case is assigned to an examiner, the evidence undergoes microscopic examination and comparison to determine whether the fired evidence submitted was fired in the same firearm. Fifty-one firearms blind cases resulting in 570 analysis and comparison determinations were reported between December 2015 and June 2021. No unsatisfactory results were obtained; however, 40.3% of comparisons in which the ground truth was either elimination or identification resulted in inconclusive conclusions. Due to the quality of some of the evidence submitted, inconclusive results were not unexpected. A ground truth of elimination and comparison result of inconclusive was observed at a rate of 74%, while a ground truth of identification and comparison result of inconclusive was observed at a rate of 31%. Bullets (61.8%) were the main contributors to inconclusive conclusions; variables such as the assigned examiners, training program, examiner experience, and the intended complexity of the case did not significantly contribute to the results. The program demonstrates that the quality management system and firearms section procedures can obtain accurate and reliable results and provides examiners added confidence in court. Additionally, the program can be tailored to target specific research questions and provide opportunities for collaboration with other laboratories and researchers.
Forensic practitioners have long relied on their expertise in judging evidence from a crime scene. But the reliance on expert judgement can be problematic as it is inherently subjective. By Karen Kafadar.
Declared proficiency tests are limited in their use for testing the performance of the entire system, because analysts are aware that they are being tested. A blind quality control (BQC) is intended to appear as a real case to the analyst to remove any intentional or subconscious bias. A BQC program allows a real-time assessment of the laboratory's policies and procedures and monitors reliability of casework. In September 2015, the Houston Forensic Science Center (HFSC) began a BQC program in blood alcohol analysis. Between September 2015 and July 2018, HFSC submitted 317 blind cases: 89 negative samples and 228 positive samples at five target concentrations (0.08, 0.15, 0.16, 0.20 and 0.25 g/100 mL; theoretical targets). These blood samples were analyzed by a headspace gas chromatograph interfaced with dual-flame ionization detectors (HS-GC-FID). All negative samples produced 'no ethanol detected' results. The mean (range) of reported blood alcohol concentrations (BACs) for the aforementioned target concentrations was 0.075 (0.073-0.078), 0.144 (0.140-0.148), 0.157 (0.155-0.160), 0.195 (0.192-0.200) and 0.249 (0.242-0.258) g/100 mL, respectively. The average BAC percent differences from the target for the positive blind cases ranged from -0.4 to -6.3%, within our uncertainty of measurement (8.95-9.18%). The rate of alcohol evaporation/degradation was determined negligible. A multiple linear regression analysis was performed to compare the % difference in BAC among five target concentrations, eight analysts, three HS-GC-FID instruments and two pipettes. The variables other than target concentrations showed no significant difference (P > 0.2). While the 0.08 g/100 mL target showed a significantly larger % difference than higher target concentrations (0.15-0.25 g/100 mL), the % differences among the higher targets were not concentration-dependent. Despite difficulties like gaining buy-in from stakeholders and mimicking evidence samples, the implementation of a BQC program has improved processes, shown methods are reliable and added confidence to staff's testimony in court.
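The percent differences quoted in the blood alcohol abstract can be checked from the reported means. The sketch below assumes the percent difference is defined as (reported - target) / target, which the abstract does not state explicitly; the variable and function names are illustrative.

```python
# Target concentrations and mean reported BACs (g/100 mL), as listed in the abstract.
targets = [0.08, 0.15, 0.16, 0.20, 0.25]
mean_reported = [0.075, 0.144, 0.157, 0.195, 0.249]


def percent_difference(reported, target):
    """Signed percent difference of a reported value from its target (assumed definition)."""
    return 100.0 * (reported - target) / target


diffs = [percent_difference(m, t) for m, t in zip(mean_reported, targets)]
for target, diff in zip(targets, diffs):
    print(f"target {target:.2f} g/100 mL: {diff:+.2f}%")
```

Under this definition the rounded means give differences between about -0.4% and -6.25%, in line with the -0.4 to -6.3% range stated in the abstract, and the largest deviation occurs at the 0.08 g/100 mL target.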
We develop a statistical approach to model handwriting that accommodates all styles of writing (cursive, print, connected print). The goal is to compute a posterior probability of writership of a questioned document given a closed set of candidate writers. Such probabilistic statements can support examiner conclusions and enable a quantitative forensic evaluation of handwritten documents. Writing is treated as a sequence of disjoint graphical structures, which are extracted using an automated and open-source process. The graphs are grouped based on the similarity of their shapes through a K-means clustering template. A person's writing pattern can be characterized by the rate at which graphs are emitted to each cluster. The cluster memberships serve as data for a Bayesian hierarchical model with a mixture component. The rate of mixing between two parameters in the hierarchy indicates writing style.
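As a rough illustration of a posterior probability of writership over a closed set of candidates, the sketch below uses a deliberately simplified model: each writer's cluster rates are estimated from training counts with additive smoothing, and the questioned document's cluster counts are scored with a multinomial likelihood under a uniform prior. This is an assumption-laden stand-in, not the Bayesian hierarchical model with a mixture component described in the abstract; all names and counts are hypothetical.

```python
import math


def writer_posterior(training_counts, questioned_counts, smoothing=1.0):
    """Posterior over a closed set of writers given cluster counts (simplified model).

    training_counts:   dict mapping writer -> list of graph counts per cluster
    questioned_counts: list of graph counts per cluster in the questioned document
    smoothing:         additive smoothing so that no cluster rate is exactly zero
    """
    log_scores = {}
    for writer, counts in training_counts.items():
        total = sum(counts) + smoothing * len(counts)
        # Multinomial log-likelihood up to a constant shared by all writers.
        log_scores[writer] = sum(
            q * math.log((c + smoothing) / total)
            for q, c in zip(questioned_counts, counts)
        )
    # Uniform prior: normalize the likelihoods, subtracting the max for stability.
    peak = max(log_scores.values())
    weights = {w: math.exp(s - peak) for w, s in log_scores.items()}
    norm = sum(weights.values())
    return {w: v / norm for w, v in weights.items()}


# Hypothetical cluster counts for three candidate writers and one questioned document.
training = {
    "writer_A": [30, 5, 5],
    "writer_B": [5, 30, 5],
    "writer_C": [10, 10, 20],
}
questioned = [12, 2, 3]
posterior = writer_posterior(training, questioned)
for writer, prob in posterior.items():
    print(f"{writer}: {prob:.4f}")
```

The closed-set assumption is what allows the scores to be normalized into probabilities that sum to one; if the true writer is not among the candidates, the posterior is not meaningful.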
Land engraved areas (LEAs) provide evidence to address the same source-different source problem in forensic firearms examination. Collecting 3D images of bullet LEAs requires capturing portions of the neighboring groove engraved areas (GEAs). Analyzing LEA and GEA data separately is imperative to accuracy in automated comparison methods such as the one developed by Hare et al. (Ann Appl Stat 2017;11, 2332). Existing standard statistical modeling techniques often fail to adequately separate LEA and GEA data due to the atypical structure of 3D bullet data. We developed a method for automated removal of GEA data based on robust locally weighted regression (LOESS). This automated method was tested on high-resolution 3D scans of LEAs from two bullet test sets with a total of 622 LEA scans. Our robust LOESS method outperforms a previously proposed "rollapply" method. We conclude that our method is a major improvement upon rollapply, but that further validation needs to be conducted before the method can be applied in a fully automated fashion.
The same-source problem remains a major challenge in forensic toolmark and firearm examination. Here, we investigate the applicability of the Chumbley method (J Forensic Sci, 2018, 63, 849; J Forensic Sci, 2010, 55, 953), developed for screwdriver markings, for same-source identification of striations on bullet LEAs. The Hamby datasets 44 and 252 measured by NIST and CSAFE (high-resolution scans) are used here. We provide methods to identify parameters that minimize error rates for matching of LEAs, and a remedial algorithm to alleviate the problem of failed tests, while increasing the power of the test and reducing error rates. For 85,491 land-to-land comparisons (84,235 known nonmatches and 1256 known matches), the adapted test does not provide a result in 176 situations (originally more than 500). The Type I and Type II error rates are 7.2% (6105 out of 84,235) and 21.4% (271 out of 1256), respectively. This puts the proposed method on similar footing as other single-feature matching approaches in the literature.
This study involved 71 forensic seized drug laboratories analyzing 65 total samples; 17 were ground-truth positive (i.e., they contained methamphetamine or cocaine); 48 were ground-truth negative (i.e., they did not contain methamphetamine or cocaine). The positive samples were prepared at several target-analyte concentrations and combined with common cutting agents. The negative samples were designed to be challenging and prepared to contain positional isomers of methamphetamine. Participants were sent two different sample sets. In the first, they were directed to only use a single, pre-selected analytical technique. In the second, they were directed to use a pre-selected analytical scheme consisting of multiple techniques in compliance with ASTM E2329-17. The results of the study showed good accuracy; sensitivity was 1.000 for all analytical schemes with 1-specificity (the false-positive rate) ranging from 0.000 to 0.250 when ASTM E2329-17 compliant analytical schemes were used. When only a single technique was used, accuracy was generally not as good; sensitivity ranged from 1.000 to 0.091, and 1-specificity ranged from 0.000 to 0.245.
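For readers less familiar with the metrics in the seized drug study, the sketch below shows how sensitivity and 1-specificity (the false-positive rate) follow from a confusion table. The counts are hypothetical and are not results from the study; the function name is illustrative.

```python
def sensitivity_and_false_positive_rate(tp, fn, fp, tn):
    """Return (sensitivity, 1 - specificity) from confusion-table counts.

    tp: ground-truth positives reported positive
    fn: ground-truth positives reported negative
    fp: ground-truth negatives reported positive
    tn: ground-truth negatives reported negative
    """
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    return sensitivity, false_positive_rate


# Hypothetical counts for one analytical scheme.
sens, fpr = sensitivity_and_false_positive_rate(tp=17, fn=0, fp=3, tn=45)
print(f"sensitivity = {sens:.3f}, 1-specificity = {fpr:.4f}")
```

A sensitivity of 1.000 means no positive sample was missed, while the false-positive rate captures how often a challenging negative, such as a positional isomer, was reported as the target analyte.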
Scientific research is driven by our ability to use methods, procedures, and materials from previous studies and further research by adding to it. As the need for computationally-intensive methods to analyze large amounts of data grows, the criteria needed to achieve reproducibility, specifically computational reproducibility, have become more sophisticated. In general, prosaic descriptions of algorithms are not detailed or precise enough to ensure complete reproducibility of a method. Results may be sensitive to conditions not commonly specified in written-word descriptions such as implicit parameter settings or the programming language used. To achieve true computational reproducibility, it is necessary to provide all intermediate data and code used to produce published results. In this paper, we consider a class of algorithms developed to perform firearm evidence identification on cartridge case evidence known as the Congruent Matching Cells (CMC) methods. To date, these algorithms have been published as textual descriptions only. We introduce the first open-source implementation of the Congruent Matching Cells methods in the R package cmcR. We have structured the cmcR package as a set of sequential, modularized functions intended to ease the process of parameter experimentation. We use cmcR and a novel variance ratio statistic to explore the CMC methodology and demonstrate how to fill in the gaps when provided with computationally ambiguous descriptions of algorithms.
Hal S. Stern, Maria Cuellar and David Kaye describe how scientists define and assess the reliability and validity of some commonly encountered types of forensic science evidence. Such assessments are necessary for courts to admit putatively scientific evidence as bona fide and legally “reliable” science.
When wires are cut, the tool produces striations on the cut surface; as in other forms of forensic analysis, these striation marks are used to connect the evidence to the source that created them. Here, we argue that the practice of comparing two wire cut surfaces introduces complexities not present in better-investigated forensic examination of toolmarks such as those observed on bullets, as wire comparisons inherently require multiple distinct comparisons, increasing the expected false discovery rate. We call attention to the multiple comparison problem in wire examination and relate it to other situations in forensics that involve multiple comparisons, such as database searches.
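The multiple comparison concern can be made concrete with a standard textbook calculation: if each individual comparison has some false-match probability and the comparisons are treated as independent, the chance of at least one false match grows quickly with the number of comparisons. The independence assumption, the per-comparison rate, and the function name below are illustrative assumptions, not the model used in the paper.

```python
def prob_at_least_one_false_match(per_comparison_rate, n_comparisons):
    """Probability of one or more false matches across independent comparisons."""
    return 1.0 - (1.0 - per_comparison_rate) ** n_comparisons


# Hypothetical per-comparison false-match rate of 1%.
rate = 0.01
for n in (1, 10, 100):
    p = prob_at_least_one_false_match(rate, n)
    print(f"{n:>3} comparisons: P(at least one false match) = {p:.3f}")
```

Even a small per-comparison rate compounds when a single examination implicitly involves many distinct comparisons, which is the same mechanism that affects database searches.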