Timetable (FOSW03)

Statistical modelling of scientific evidence

Monday 7th November 2016 to Friday 11th November 2016

Monday 7th November 2016
12:00 to 12:30 Registration
12:30 to 13:30 Lunch @ Wolfson Court
13:40 to 13:45 Welcome from Christie Marr (INI Deputy Director) INI 1
13:45 to 14:00 Introduction INI 1
14:00 to 15:00 Gerd Gigerenzer (Max-Planck-Institut für Bildungsforschung)
Risk Literacy: How to Make Sense of Statistical Evidence
We are taught reading and writing, but rarely statistical thinking. Law schools and medical schools have not yet made sufficient efforts to teach their students how to understand and communicate statistical evidence. The result is collective risk illiteracy: many judges, lawyers, doctors, journalists and politicians do not understand statistical evidence and unwittingly draw wrong conclusions. Focusing on legal and medical evidence, I will discuss common errors in evaluating evidence and efficient tools that help professionals to overcome them. Risk literacy, together with efficient techniques for communicating statistical information, is a necessary precondition for meeting the challenges of modern technological society.
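One tool Gigerenzer has long advocated for communicating statistical evidence is the use of natural frequencies rather than conditional probabilities. A minimal sketch of the idea (all figures are illustrative, not taken from the talk):

```python
# Natural-frequency reasoning: instead of multiplying conditional probabilities,
# count outcomes in an imagined cohort. Out of 1000 people screened at 1%
# prevalence, 10 have the condition; 9 of them test positive (90% sensitivity),
# while 89.1 of the 990 unaffected also test positive (9% false-positive rate).
def positive_predictive_value(n, prevalence, sensitivity, false_positive_rate):
    affected = n * prevalence
    true_pos = affected * sensitivity
    false_pos = (n - affected) * false_positive_rate
    return true_pos / (true_pos + false_pos)

ppv = positive_predictive_value(1000, 0.01, 0.9, 0.09)
# Only about 9 of the ~98 positives are true positives, so the PPV is near 9%,
# far lower than the 90% many professionals intuitively report.
```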
15:00 to 15:30 Afternoon Tea
15:30 to 16:15 Geoffrey Stewart Morrison (University of Alberta)
What should a forensic scientist's likelihood ratio be?

How should a forensic scientist arrive at a value for the strength of evidence statement that they present to the court? A number of different answers have been proposed.

One proposal is to assign probabilities based on experience and subjective judgement. This appears to be advocated in the Association of Forensic Science Providers (AFSP) 2009 standards, and the 2015 European Network of Forensic Science Institutes (ENFSI) guideline on evaluative reporting. But the warrant for such subjective judgements has been questioned. The 1993 US Supreme Court Daubert ruling and the 2016 report by the President’s Council of Advisors on Science and Technology (PCAST) argue strongly that subjective judgment is not enough, that empirical validation is needed.

If a forensic likelihood ratio is to be based on subjective judgement, it has been proposed that the judgement be empirically calibrated.

The PCAST report proposes a procedure which results in a dichotomous likelihood ratio. The practitioner applies a threshold and declares “match” or “non-match”. If a “match” is declared, the empirically derived correct acceptance rate and false acceptance rate are also provided (dividing the former by the latter would produce a likelihood ratio). Mutatis mutandis if a “non-match” is declared. This has been criticised for discarding information and thus resulting in poor performance.
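The division described above can be made concrete in a few lines. The rates here are hypothetical placeholders, not figures from the PCAST report:

```python
# The likelihood ratio implicit in the PCAST two-stage procedure:
# a declared "match" carries LR = correct-acceptance rate / false-acceptance rate,
# and a declared "non-match" carries the ratio of the complementary rates.
correct_acceptance_rate = 0.95  # hypothetical, from empirical validation
false_acceptance_rate = 0.01    # hypothetical

lr_match = correct_acceptance_rate / false_acceptance_rate
lr_non_match = (1 - correct_acceptance_rate) / (1 - false_acceptance_rate)
```

Whatever the underlying evidence score was, every declared “match” receives the same LR, which is the information loss the criticism above refers to.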

The AFSP standards and ENFSI guideline propose the use of ordinal scales – each level on the scale covers a pre-specified range of likelihood ratio values, and has an associated verbal expression. These have been criticised on a number of grounds, including for having arbitrary ranges, for suffering from cliff-edge effects, and for verbal expressions being vague – they will be interpreted differently by different individuals, and differently by the same individual in different contexts.

It has also been proposed that numeric likelihood ratios be calculated on the basis of relevant data, quantitative measurements, and statistical models, and that the numeric likelihood ratio output of the statistical model be directly reported as the strength of evidence statement. Such an approach is transparent and replicable, and, relative to procedures based primarily on subjective judgement, it is easier to empirically calibrate and validate under conditions reflecting those of the case under investigation, and it is more resistant to cognitive bias.

Score based procedures first calculate a score which quantifies the degree of similarity (or difference) between pairs of objects, then apply a subsequent model which converts scores to likelihood ratios (the second stage can be considered an empirical calibration stage). Scores which only take account of similarity (or difference), however, do not account for typicality with respect to the relevant population for the case, and this cannot be corrected at the score-to-likelihood-ratio conversion stage. If a score based procedure is used, the scores should take account of both similarity and typicality.
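One common form of the second-stage conversion fits parametric densities to same-source and different-source training scores and takes their ratio. A toy sketch with made-up scores and Gaussian score models (one choice among several; the abstract does not prescribe a model):

```python
from statistics import NormalDist, mean, stdev

# Fit Gaussians to same-source and different-source training scores
# (fabricated numbers for illustration), then convert a casework score to an LR.
same_source_scores = [7.1, 6.8, 7.4, 6.5, 7.0, 6.9, 7.3]
diff_source_scores = [2.0, 3.1, 2.4, 1.8, 2.9, 2.2, 2.6]

f_same = NormalDist(mean(same_source_scores), stdev(same_source_scores))
f_diff = NormalDist(mean(diff_source_scores), stdev(diff_source_scores))

def score_to_lr(score):
    # Ratio of the two fitted densities at the observed score.
    return f_same.pdf(score) / f_diff.pdf(score)

lr = score_to_lr(6.7)  # a high-similarity score yields LR well above 1
```

Note that if the score itself ignores typicality, no choice of densities at this stage can recover it, which is the limitation the abstract points out.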

Numeric likelihood ratios can be calculated in a frequentist manner or a subjectivist Bayesian manner. Philosophically the former is an estimate of a true but unknown value, and the latter is a state of belief, a personal probability. A frequentist will assess the precision of their estimate, whereas a subjectivist Bayesian will have attempted to account for all sources of uncertainty in the assignment of the value of their likelihood ratio (a Bayes factor). The merits of the two approaches are hotly debated (including currently in a virtual special issue in Science & Justice), but if presented with a frequentist point estimate plus degree of precision the trier of fact may decide to use a likelihood ratio closer to 1 than the point estimate (the deviation depending on the degree of precision), and (depending on the prior used) the value of the Bayes factor will be closer to 1 than a frequentist point estimate of a likelihood ratio. Can these be considered to have the same practical result? Which would be preferred by the courts? Can Bayesian procedures with empirical or reference priors be adopted without having to buy in to the subjectivist philosophy? What should a forensic scientist’s likelihood ratio be?

16:15 to 17:00 Daniel Ramos (Universidad Autonoma de Madrid)
Measuring Performance of Likelihood-Ratio-Based Evidence Evaluation Methods

The value of the evidence in forensic science is increasingly expressed by a Likelihood Ratio (LR), following a Bayesian framework. The LR aims at providing the information given by the evidence to the decision process in a trial. Although theoretical aspects of statistical models are essential to compute the LR, in real forensic situations there exist many other factors (including e.g. data sparsity, data variability and dataset shift) that might degrade the performance of the LR. This means that the computed LR values might be misleading, ultimately causing a loss in the accuracy of the decisions made by the fact finder. Therefore, it is essential to measure the performance of LR methods in forensic situations, with the further objective of validating LR methods for their use in casework. In this talk, we will present several popular performance measures for LR values. We will provide examples where these measures are used to compare different methods in the context of trace evidence. Finally, we will present a recently-proposed guideline for the validation of LR methods in forensic science, which relies upon the use of performance measures of LR methods.
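One widely used performance measure for sets of LR values (the abstract does not name its measures, so take this as a representative example) is the log-likelihood-ratio cost, Cllr:

```python
from math import log2

def cllr(lrs_same, lrs_diff):
    """Log-likelihood-ratio cost. Same-source LRs are penalised for being small,
    different-source LRs for being large; a system that always reports LR = 1
    scores exactly 1, and well-calibrated discriminating systems score below 1."""
    p = sum(log2(1 + 1 / lr) for lr in lrs_same) / len(lrs_same)
    d = sum(log2(1 + lr) for lr in lrs_diff) / len(lrs_diff)
    return 0.5 * (p + d)

uninformative = cllr([1.0, 1.0], [1.0, 1.0])   # exactly 1.0
good_system = cllr([100.0, 50.0], [0.01, 0.02])  # close to 0
```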

17:00 to 18:00 Welcome Wine Reception at INI
Tuesday 8th November 2016
09:30 to 10:15 Gabriel Vivó-Truyols (Universiteit van Amsterdam)
Interpreting (chemical) forensic evidence in a Bayesian framework: a multidisciplinary task
Co-authors: Marjan Sjerps (University of Amsterdam / Dutch Forensic Institute), Martin Lopatka (University of Amsterdam / Dutch Forensic institute), Michael Woldegebriel (University of Amsterdam), Andrei Barcaru (University of Amsterdam)
The interpretation and evaluation of chemical forensic evidence is a challenging task of a multidisciplinary nature. Interaction between different disciplines (Bayesian statisticians, analytical chemists, signal processors, instrument experts, etc.) is necessary. In this talk I will illustrate different cases of such interaction to evaluate pieces of evidence in a forensic context. The first case is the evaluation of fire debris using two-dimensional gas chromatography. Such a technique analyses fire debris to look for traces of (pyrolysed) hydrocarbons. However, the classification of such hydrocarbons is a difficult task, demanding experts in (analytical) chemistry. Even more difficult is to interpret such evidence in a Bayesian framework. The second case is the application of Bayesian inference in the toxicology laboratory. In this case, a set of targeted compounds is analysed via LC-MS. Instruments normally pre-process the data in a deterministic manner, providing the so-called peak table. We propose an alternative that uses the raw data as evidence, instead of using such a peak table. The third case is the exploration of differences between different analyses, in order to find illegal additives in a complex matrix. In this case, the Jensen-Shannon divergence has been applied in a Bayesian framework to highlight such differences.
10:15 to 11:00 William Thompson (University of California, Irvine)
Elicitation of Priors in Bayesian Modeling of DNA Evidence

Bayesian networks have been helpful for analyzing the probative value of complex forms of forensic DNA evidence, but some of these network models require experts to estimate the prior probability of specific events. This talk discusses procedures that might be used for elicitation of priors with an eye toward minimizing bias and error. As an illustration it uses a model proposed by Biedermann, Taroni & Thompson (2011) to deal with situations in which the "inclusion" of the suspect as a possible contributor to a mixed DNA sample depends on the value of an unknown variable. (Biedermann, A., Taroni, F. & Thompson, W.C. Using graphical probability analysis (Bayes nets) to evaluate a conditional DNA inclusion. Law, Probability and Risk, 10: 89-121, 2011). 

11:00 to 11:30 Morning Coffee
11:30 to 12:15 John Aston (University of Cambridge)
Inverting Entomological Growth Curves Using Functional Data Analysis


Co-authors: Davide Pigoli (University of Cambridge), Anjali Mazumder (Carnegie Mellon University), Frederic Ferraty (Toulouse Jean Jaures University), Martin Hall (Natural History Museum)
When a body is discovered, it is often necessary to determine the time of death or, more formally, the post mortem interval (PMI). Forensic entomology can be used to estimate this PMI by examining evidence obtained from the body in the form of insect larval growth. Growth curves, however, are temperature dependent, and direct temperature measurements from the body location are usually unavailable for the time periods of interest. In this work, we investigate models for PMI estimation, including temperature prediction, based on functional data analysis. We will evaluate the possibilities of using different models, particularly based on ideas from function registration, to try to obtain inferences concerning PMI, and indeed whether multiple species data can be incorporated into the model. This can allow even more accurate estimation of PMI.


12:15 to 13:00 Sue Pope
Modelling the best evidence

This talk will consider the benefits for the courts of maximising the amount of information gained from a result before statistical modelling is carried out, using complex DNA results as an example. Some mixtures of DNA obtained from crime and questioned samples are either too complex or too partial to give meaningful likelihood ratios even with the wider range of calculation software now available. One current option, providing the expert’s qualitative opinion in place of a formal likelihood ratio, while sanctioned by the courts, has been controversial. The time and effort spent improving the amount and quality of DNA being analysed before attempting to carry out specialist DNA mixture calculations will be repaid by achieving a more discriminating likelihood ratio. The relevance of the samples to the evidential issues will also be discussed.

13:00 to 13:30 Lunch @ Wolfson Court
14:00 to 15:00 Optional Discussion Forum: Discussion Room
15:00 to 15:30 Afternoon Tea
15:30 to 16:15 Norah Rudin
Complex DNA profile interpretation: stories from across the pond
A story of samples and statistics: The history of a forensic sample, the history and current state of forensic DNA interpretation and statistics in the U.S.

With the continued increase in the sensitivity of DNA testing systems comes a commensurate increase in the complexity of the profiles generated. Numerous sophisticated statistical tools intended to provide an appropriate weight of evidence for these challenging samples have emerged over the last several years.  While it seems clear that only a likelihood ratio-based probabilistic genotyping approach is appropriate to address the ambiguity inherent in these complex samples, the relative merits of the different approaches are still being investigated.
The first part of this talk will address the generation of DNA samples from a forensic science perspective. Long before the statistical weight of evidence is considered, numerous decision points determine what samples are collected, what samples are tested, how they are tested and what questions are asked of them. It is critical to understand the samples' history and the milieu in which their journey takes place on their way to becoming profiles that require interpretation and statistical assessment. We will then summarize the history of approaches typically used by working analysts in the US, and discuss the current state of the practice. In 2005 and 2013, NIST distributed sets of mixtures to working laboratories and collected their interpretations and statistical weights. They found a wide range of variation both within and between laboratories in calculating the weight of evidence for the same sample in both surveys. Most disturbing was the continued use of simplistic tools, such as the CPI/CPE (RMNE), long considered inadequate for specific types of profiles. A number of publications and reports over the last 15 years have commented on the interpretation and statistical weighting of forensic DNA profiles. These include the ISFG commission papers of 2006 and 2012, the NAS 2009 report, the 2010 SWGDAM STR interpretation guidelines, and the 2015 SWGDAM probabilistic genotyping software validation guidelines. Several high-profile criticisms of laboratory protocols (e.g. Washington D.C. and the Texas laboratory system) have emerged and fueled debate. Most recently, PCAST published a report commenting on the state of forensic science disciplines in the US, including DNA. An updated draft of the SWGDAM STR interpretation guidelines is currently posted for comment. We will discuss these various publications and commentaries as time permits.
16:15 to 17:00 Keith Inman (California State University)
Complex DNA profile interpretation: stories from across the pond
A comparison of complex profiles analyzed with different software tools

The second part of this talk will document the creation and evaluation of ground-truth samples of increasing complexity. Variables include: the number of contributors, the amount of allele sharing, variability in template amount, and varying mixture ratios of contributors in mixed samples. These samples will be used to evaluate the efficacy of four open-source implementations of likelihood ratio approaches to estimating the weight of evidence: Lab Retriever (an implementation of likeLTD v2), LRMix Studio, European Forensic Mixture, and likeLTD v6. This work was initiated during the summer of 2016 in conjunction with the special semester at the Newton Institute devoted to Probability and Statistics in Forensic Science, and continues at the time of submission of this abstract. We look forward to presenting results “hot off the press” at this meeting.
Wednesday 9th November 2016
09:30 to 10:15 Roberto Puch-Solis
Evaluation of forensic DNA profiles while accounting for one and two repeat less and two repeat more stutters


Co-author: Dr Therese Graversen (University of Copenhagen)
Current forensic DNA profile technology is very sensitive and can produce profiles from a minute amount of DNA, e.g. from one cell. A profile from a stain recovered from a crime scene is represented through an electropherogram (epg), which consists of peaks located in positions corresponding to alleles. Peak heights are related to the originating amount of DNA: the more DNA the sample contains, the taller the peaks are.

An epg also tends to contain artefactual peaks of different kinds. Some of these artefacts originate during PCR amplification and are usually called ‘stutters’. The most predominant stutter appears one STR repeat below the corresponding allele and is about 10% of the height of the allelic peak, although this percentage varies from locus to locus. Given the sensitivity of the DNA systems, other stutters also tend to appear in the epg: one located two STR repeats below and the other one STR repeat above the allelic peak. They tend to be much smaller than their corresponding one-repeat-below stutters.

Many stain profiles from samples taken from a scene of a crime originate from more than one person, where each person contributes a different amount of DNA. The peaks of minor contributors can be about the same height as the stutters of a major contributor. A stutter could also combine with an allelic peak or with other stutters, making an evaluation more complicated. Caseworkers are also scrutinised on their stutter designations in court.

Graversen & Lauritzen (2015) introduced an efficient method for calculating likelihood ratios using Bayesian networks. In this talk, this method is extended to consider two STRs less and one STR more stutters, and the complexities of the extension are discussed.


Graversen T. & Lauritzen S. (2015). Computational aspects of DNA mixture analysis: exact inference using auxiliary variables in a Bayesian network. Statistics & Computing 25, pp. 527-541. 


10:15 to 11:00 Therese Graversen (Københavns Universitet (University of Copenhagen))
An exact, efficient, and flexible representation of statistical models for DNA profiles

Many different models have been proposed for a statistical interpretation of mixed DNA profiles. Regardless of the model, a computational bottleneck lies in the necessity to handle the large set of possible combinations of DNA profiles for the contributors to the DNA sample.

I will explain how models can be specified in a very general setup that makes it simple to compute both the likelihood and many other quantities that are crucial to a thorough statistical analysis. Notably all computations in this setup are exact, whilst still efficient.

I have used this setup to implement the statistical software package DNAmixtures.

11:00 to 11:30 Morning Coffee
11:30 to 12:00 Jacob de Zoete (Universiteit van Amsterdam)
Cell type determination and association with the DNA donor
Co-authors: Wessel Oosterman (University of Amsterdam), Bas Kokshoorn (Netherlands Forensic Institute), Marjan Sjerps (Korteweg-de Vries Institute for Mathematics, University of Amsterdam)

In forensic casework, evidence regarding the type of cell material contained in a stain can be crucial in determining what happened. For example, a DNA match in a sexual offense can become substantially more incriminating when there is evidence supporting that semen cells are present.

Besides the question of which cell types are present in a sample, the question of who donated what (association) is also very relevant. This question is surprisingly difficult, even for stains with a single donor. The evidential value of a DNA profile needs to be combined with knowledge regarding the specificity and sensitivity of cell type tests. This, together with prior probabilities for the different donor-cell type combinations, determines the most likely combination.

We present a Bayesian network that can assist in associating donors and cell types. A literature overview on the sensitivity and specificity of three cell type tests (PSA test for seminal fluid, RSID saliva and RSID semen) is helpful in assigning conditional probabilities. The Bayesian network is linked with a software package for interpreting mixed DNA profiles. This allows for a sensitivity analysis that shows to what extent the conclusion depends on the quantity of available research data. This can aid in making decisions regarding further research.

It is shown that the common assumption that an individual (e.g. the victim) is one of the donors in a mixed DNA profile can have unwanted consequences for the association between donors and cell types.
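The basic way sensitivity and specificity of a cell type test enter such a network can be shown in two lines. The figures here are hypothetical, not the literature values reviewed in the talk:

```python
# Evidential value of a cell-type test result, from its sensitivity and
# specificity (illustrative figures only).
sensitivity = 0.93   # hypothetical P(test positive | semen present)
specificity = 0.99   # hypothetical P(test negative | semen absent)

lr_positive = sensitivity / (1 - specificity)   # LR supporting "semen present"
lr_negative = (1 - sensitivity) / specificity   # LR given a negative test (< 1)
```

In the Bayesian network these LRs are combined with the DNA evidence and the prior probabilities over donor-cell type combinations.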
12:00 to 12:30 Maarten Kruijver (Vrije Universiteit Amsterdam)
Modeling subpopulations in a forensic DNA database using a latent variable approach
Several problems in forensic genetics require a representative model of a forensic DNA database. Obtaining an accurate representation of the offender database can be difficult, since databases typically contain groups of persons with unregistered ethnic origins in unknown proportions. We propose to estimate the allele frequencies of the subpopulations comprising the offender database and their proportions from the database itself using a latent variable approach. We present a model for which parameters can be estimated using the expectation maximization (EM) algorithm. This approach does not rely on relatively small and possibly unrepresentative population surveys, but is driven by the actual genetic composition of the database only. We fit the model to a snapshot of the Dutch offender database (2014), which contains close to 180,000 profiles, and find that three subpopulations suffice to describe a large fraction of the heterogeneity in the database. We demonstrate the utility and reliability of the approach by using the model to predict the number of false leads obtained in database searches. We assess how well the model predicts the number of false leads obtained in mock searches in the Dutch offender database, both for the case of familial searching for first degree relatives of a donor and searching for contributors to three person mixtures. We also study the degree of partial matching between all pairs of profiles in the Dutch database and compare this to what is predicted using the latent variable approach.
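The EM idea behind the abstract can be illustrated on a deliberately tiny scale. This sketch is not the paper's model: it uses 3 biallelic loci, 2 latent subpopulations and simulated haplotypes, purely to show how the latent-variable estimation works:

```python
import random
random.seed(1)

# Toy latent-class EM: each "profile" is a haplotype over 3 biallelic loci,
# drawn from a 2-subpopulation mixture with unknown weights and frequencies.
TRUE_W = [0.7, 0.3]
TRUE_F = [[0.9, 0.8, 0.1],   # P(allele "1") per locus, subpopulation 0
          [0.2, 0.3, 0.9]]   # subpopulation 1

def sample_profile():
    k = 0 if random.random() < TRUE_W[0] else 1
    return tuple(int(random.random() < p) for p in TRUE_F[k])

db = [sample_profile() for _ in range(2000)]

def lik(profile, freqs):
    out = 1.0
    for allele, p in zip(profile, freqs):
        out *= p if allele == 1 else 1 - p
    return out

# EM from a deliberately vague starting point.
w = [0.5, 0.5]
f = [[0.6, 0.6, 0.4], [0.4, 0.4, 0.6]]
for _ in range(200):
    # E-step: responsibility of each subpopulation for each profile.
    resp = []
    for x in db:
        joint = [w[k] * lik(x, f[k]) for k in range(2)]
        s = sum(joint)
        resp.append([j / s for j in joint])
    # M-step: re-estimate mixture weights and allele frequencies.
    for k in range(2):
        nk = sum(r[k] for r in resp)
        w[k] = nk / len(db)
        f[k] = [sum(r[k] * x[l] for r, x in zip(resp, db)) / nk
                for l in range(3)]
# w and f now approximate TRUE_W and TRUE_F, using the database alone.
```

The real application replaces this toy likelihood with genotype probabilities over STR loci and selects the number of subpopulations from the data.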
12:30 to 13:00 Torben Tvedebrink (Aalborg Universitet); (Københavns Universitet (University of Copenhagen))
Inference platform for Ancestry Informative Markers
Co-authors: Poul Svante Eriksen (Aalborg University), Helle Smidt Mogensen (University of Copenhagen), Niels Morling (University of Copenhagen)

In this talk I will present a platform for making inference about Ancestry Informative Markers (AIMs), which are a panel of SNP markers used in forensic genetics to infer the population origin of a given DNA profile.

Several research groups have proposed such AIM panels, each with a specific objective in mind. Some were designed to discriminate between closely related ethnic groups, whereas others focus on larger distances (more remotely located populations). This talk is not about selecting markers or populations for testing. The focus will be on how to provide forensic geneticists with a tool that can be used to infer the most likely population(s) of a given DNA profile.

Using R and Shiny (a web application framework for R, from RStudio), I have developed a platform that provides the numerical and visual output necessary for the geneticist to analyse and report the genetic evidence.

In the talk I will discuss the evidential weight in this situation and uncertainties in population frequencies. As the database of populations is not exhaustive, there is no guarantee that there exists a relevant population in the database, where ‘relevant’ means a population sufficiently close to the ‘true’ population. We derive a database score specific to each DNA profile and use this score to assess the relevance of the database relative to the DNA profile.
13:00 to 13:30 Lunch @ Wolfson Court
13:30 to 17:00 Social activity
17:00 to 18:00 Bernard Silverman
Pre-Dinner Lecture: Forensic Science from the point of view of a Scientific Adviser to Government
I will give some examples of ways in which mathematical and statistical issues relevant to Forensic Science have arisen during my seven years as Chief Scientific Adviser to the Home Office, and also reflect more widely on the role of a scientist appointed to this role in Government. Some of the aspects of my 2011 report into research into Forensic Science remain relevant today and indeed pose challenges for all the participants in this conference. I hope that my talk will also be an opportunity to thank the Newton Institute for playing its part, alongside other national and international research organisations, in raising the profile of this important discipline.
18:45 to 22:00 Pre-Dinner Drink followed by Formal Dinner at Emmanuel College
Thursday 10th November 2016
09:30 to 10:15 Mikkel Andersen (Aalborg Universitet)
Y Chromosomal STR Markers: Assessing Evidential Value

Y chromosomal short tandem repeats (Y-STRs) are widely used in forensic genetics. The current application is mainly to detect non-matches, and subsequently release wrongly accused suspects. For matches the situation is different. For now, most analysts will just say that the haplotypes matched, but they will not assess the evidential value of this match. This is understandable given that a consensus on estimating the evidential value has not yet been reached. However, work on getting there is in progress. In this talk, the aim is to review some of the current methods for assessing the evidential value of a Y-STR match. This includes a proposal for a new way to compare methods for estimating match probabilities and a discussion of correcting for population substructure through the so-called θ (theta) method.
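For lineage markers such as Y-STR haplotypes, the θ correction mentioned above takes a particularly simple form, adjusting a naive frequency estimate towards 1 to allow for co-ancestry within subpopulations (the value of θ used here is purely illustrative):

```python
# Theta-corrected match probability for a lineage-marker haplotype:
# the probability that a second person from the same subpopulation shares
# the haplotype, given a population-level frequency estimate p.
def theta_corrected_match_probability(p, theta):
    return theta + (1 - theta) * p

naive = 0.001                                       # hypothetical frequency
adjusted = theta_corrected_match_probability(naive, 0.03)  # > naive
```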

10:15 to 11:00 Amke Caliebe (Christian-Albrechts-Universität zu Kiel)
Estimating trace-suspect match probabilities in forensics for singleton Y-STR haplotypes using coalescent theory
Estimation of match probabilities for singleton haplotypes of lineage markers, i.e. for haplotypes observed only once in a reference database augmented by a suspect profile, is an important problem in forensic genetics. We compared the performance of four estimators of singleton match probabilities for Y-STRs, namely the count estimator, both with and without Brenner’s so-called kappa correction, the surveying estimator, and a previously proposed, but rarely used, coalescent-based approach implemented in the BATWING software. Extensive simulation with BATWING of the underlying population history, haplotype evolution and subsequent database sampling revealed that the coalescent-based approach and Brenner’s estimator are characterized by lower bias and lower mean squared error than the other two estimators. Moreover, in contrast to the two count estimators, both the surveying and the coalescent-based approach exhibited a good correlation between the estimated and true match probabilities. However, although its overall performance is thus better than that of any other recognized method, the coalescent-based estimator is still very computationally intensive. Its application in forensic practice therefore will have to be limited to small reference databases, or to isolated cases of particular interest, until more powerful algorithms for coalescent simulation have become available.

11:00 to 11:30 Morning Coffee
11:30 to 12:15 Peter Gill (University of Oslo)
Challenges of reporting complex DNA mixtures in the court-room
The introduction of complex software into routine casework is not without challenge. A number of models are available to interpret complex mixtures. Some of these are commercial, whereas others are open source. For a comprehensive validation protocol see Haned et al., Science and Justice (2016) 104-108. In practice, methods are divided into quantitative and qualitative models, and there is no preferred method. A number of cases have been reported in the UK using different software. For example, in R v. Fazal, the prosecution and defence used different software to analyse the same case. Different likelihood ratios were obtained and both were reported to the court – a way forward to prevent confusion in court is presented. This paper also highlights the necessity of ‘equality of arms’ when software is used, illustrated by several other cases. Examples of problematic proposition formulations that may provide misleading results are described.
12:15 to 13:00 Klaas Slooten (Vrije Universiteit Amsterdam)
The likelihood ratio as a random variable, with applications to DNA mixtures and kinship analysis
In forensic genetics, as in other areas of forensics, the data to be statistically evaluated are the result of a chance process, and hence one may conceive that they had been different, resulting in another likelihood ratio. Thus, one thinks of the obtained likelihood ratio in the case at hand as the outcome of a random variable. In this talk we will discuss the way to formalize this intuitive notion, and show general properties that the resulting distributions of the LR must have. We illustrate and apply these general results both to the evaluation of DNA mixtures and to kinship analysis, two standard applications of forensic DNA profiles. For mixtures, we discuss how model validation can be aided by investigation of the obtained likelihood ratios. For kinship analysis we observe that for any pairwise kinship comparison, the expected likelihood ratio does not depend on the allele frequencies of the loci that are used other than through the total number of alleles. We compare the behavior of the LR as a function of the allele frequencies with that of the weight of evidence, Log(LR), and argue that the WoE is better behaved. This talk is largely based on a series of three papers in International Journal of Legal Medicine co-authored with Thore Egeland.
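One of the general properties alluded to above can be verified directly: treating the LR as a random variable, its expectation under the defence hypothesis is exactly 1. A toy discrete model (probabilities chosen for illustration) makes the calculation explicit:

```python
# Evidence takes one of three values; the two hypotheses assign it different
# distributions (illustrative numbers). The LR is P(E|Hp) / P(E|Hd).
p_given_hp = {"a": 0.8, "b": 0.15, "c": 0.05}
p_given_hd = {"a": 0.1, "b": 0.3, "c": 0.6}

lr = {e: p_given_hp[e] / p_given_hd[e] for e in p_given_hp}

# E[LR | Hd] = sum over outcomes of P(E|Hd) * P(E|Hp)/P(E|Hd) = sum P(E|Hp) = 1.
expected_lr_under_hd = sum(p_given_hd[e] * lr[e] for e in lr)
```

The same cancellation holds for any model, which is why distributions of obtained LRs can be used for model validation as described in the talk.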

Exclusion probabilities and likelihood ratios with applications to kinship problems, Int. J. Legal Med. 128, 2014, 415-425
Exclusion probabilities and likelihood ratios with applications to mixtures, Int. J. Legal Med. 130, 2016, 39-57
The likelihood ratio as a random variable for linked markers in kinship analysis, Int. J. Legal Med. 130, 2016, 1445-1456
13:00 to 13:30 Lunch @ Wolfson Court
14:00 to 15:00 Optional Discussion Forum: Discussion Room
15:00 to 15:30 Afternoon Tea
15:30 to 16:15 Michael Sigman (University of Central Florida)
Assessing Evidentiary Value in Fire Debris Analysis


Co-author: Mary R. Williams (National Center for Forensic Science, University of Central Florida)
This presentation will examine the calculation of a likelihood ratio to assess the evidentiary value of fire debris analysis results. Models based on support vector machine (SVM), linear and quadratic discriminant analysis (LDA and QDA) and k-nearest neighbors (kNN) methods were examined for binary classification of fire debris samples as positive or negative for ignitable liquid residue (ILR). Computational mixing of data from ignitable liquid and substrate pyrolysis databases was used to generate training and cross validation samples. A second validation was performed on fire debris data from large-scale research burns, for which the ground truth (positive or negative for ILR) was assigned by an analyst with access to the gas chromatography-mass spectrometry data for the ignitable liquid used in the burn. The probabilities of class membership were calculated using an uninformative prior, and a likelihood ratio was calculated from the resulting class membership probabilities. The SVM method demonstrated high discrimination, a low error rate and good calibration for the cross-validation data; however, the performance decreased significantly for the fire debris validation data, as indicated by a significant decrease in the area under the receiver operating characteristic (ROC) curve. The QDA and kNN methods showed performance trends similar to those of SVM. The LDA method gave poorer discrimination, higher error rates and slightly poorer calibration for the cross validation data; however, the performance did not deteriorate for the fire debris validation data.
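The step from class membership probabilities to a likelihood ratio under an uninformative (equal) prior reduces to an odds transform, since with equal priors the posterior odds equal the LR. A minimal sketch with a hypothetical classifier output:

```python
# With an equal prior over the two classes (ILR-positive / ILR-negative),
# the posterior class probability p maps directly to a likelihood ratio.
def posterior_to_lr(p_positive):
    return p_positive / (1 - p_positive)

lr = posterior_to_lr(0.9)  # hypothetical classifier output: LR = 9 for ILR
```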


16:15 to 17:00 James Curran (University of Auckland)
Understanding Intra-day and Inter-day Variation in LIBS

Co-authors: Anjali Gupta (University of Auckland), Chris Triggs (University of Auckland), Sally Coulson (ESR) 

LIBS (laser-induced breakdown spectroscopy) is a low-cost, highly portable alternative to more traditional instruments such as ICP-MS and $\mu$-XRF for determining elemental composition in forensic applications. It differs from those instruments in that the output is a spectrum rather than the concentrations of elements. LIBS has great appeal in forensic science but has yet to enter the mainstream, partly because of a perceived lack of reproducibility in measurements made over successive days or weeks. In this talk I will describe a simple experiment we designed to investigate this phenomenon, and the consequences of our findings. The analysis involves both classical methodology and a simple Bayesian approach.
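The classical side of such an analysis can be illustrated with a one-way random-effects model that separates within-day from between-day variance (a hypothetical sketch on simulated data, not the experiment's actual data or code):

```python
import numpy as np

def variance_components(data):
    """One-way random-effects ANOVA estimates for a balanced design.
    `data` has shape (n_days, n_replicates); returns the within-day and
    between-day variance component estimates."""
    d, n = data.shape
    day_means = data.mean(axis=1)
    grand_mean = data.mean()
    ms_within = ((data - day_means[:, None]) ** 2).sum() / (d * (n - 1))
    ms_between = n * ((day_means - grand_mean) ** 2).sum() / (d - 1)
    between = max((ms_between - ms_within) / n, 0.0)  # truncated at zero
    return ms_within, between

# Simulated check: day-to-day sd 2 (variance 4), within-day sd 1 (variance 1)
rng = np.random.default_rng(1)
data = rng.normal(0.0, 2.0, size=(200, 1)) + rng.normal(0.0, 1.0, size=(200, 10))
within, between = variance_components(data)
```

A large between-day component relative to the within-day component would quantify the reproducibility concern described above.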

Friday 11th November 2016
09:45 to 10:30 Giulia Cereda (Université de Lausanne); (Universiteit Leiden)
A Bayesian nonparametric approach for the rare type problem


Co-author: Richard Gill (Leiden University)
The evaluation of a match between the DNA profile of a stain found on a crime scene and that of a suspect (previously identified) involves the unknown parameter p = (p1, p2, ...) (the ordered vector of the proportions of the different DNA profiles in the population of potential donors) and the names of the different DNA types.

We propose a Bayesian nonparametric method which treats p as the realization of a random variable P distributed according to the two-parameter Poisson–Dirichlet distribution, and discards the information about DNA types.

The ultimate goal of this model is to evaluate DNA matches in the rare type case, that is, the situation in which the suspect's profile, matching the crime stain profile, is not among those in the reference database. This situation is so problematic that it has been called “the fundamental problem of forensic mathematics” by Charles Brenner. 
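The two-parameter Poisson–Dirichlet (Pitman–Yor) model has a closed-form predictive rule for exactly the rare-type situation: the probability that the next observation is a previously unseen type. A minimal sketch of that rule (illustrative only, not the authors' full method; `alpha` is the discount and `theta` the strength parameter):

```python
def prob_new_type(n, k, alpha, theta):
    """Predictive probability, under the two-parameter Poisson-Dirichlet
    model, that observation n+1 is a previously unseen type, given that
    k distinct types were observed among the first n."""
    return (theta + alpha * k) / (theta + n)

# With no observations yet, a new type is certain:
print(prob_new_type(0, 0, alpha=0.5, theta=1.0))  # -> 1.0
# After 100 observations containing 20 distinct types:
print(prob_new_type(100, 20, alpha=0.5, theta=1.0))
```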


10:30 to 11:00 Morning Coffee
11:00 to 11:30 Marjan Sjerps (Universiteit van Amsterdam)
Evaluating a combination of matching class features: a general 'blind' procedure and a tire marks case
Co-authors: Ivo Alberink (NFI), Reinoud Stoel (NFI)
Tire marks are an important type of forensic evidence, as they are frequently encountered at crime scenes. When the tires of a suspect’s car are compared to marks found at a crime scene, the evidence can be very strong if so-called ‘acquired features’ are observed to correspond. When only ‘class characteristics’ such as parts of the profile are observed to correspond, it is obvious that many other tires exist that correspond equally well. This evidence is consequently usually considered very weak, or it may simply be ignored. Like Benedict et al. (2014), we argue that such evidence can still be strong and should be taken into account. We explain a method for assessing the evidential strength of a set of matching class characteristics by presenting a case example from the Netherlands involving tire marks. Only parts of two different tire profiles were visible, in combination with measurements of the axle width. Suitable databases already existed and were accessible to forensic experts. We show how such data can be used to quantify the strength of this evidence and how it can be reported. We also show how the risk of contextual bias may be minimized in cases like this. In this exemplar case quite strong evidence was obtained, which was accepted and used by the Dutch court. We describe a general procedure for quantifying the evidential value of an expert’s opinion of a ‘match’. This procedure can be applied directly to other types of pattern evidence such as shoeprints, fingerprints, or images. Furthermore, it is ‘blind’ in the sense that context information management (CIM) is applied to minimize bias.
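Under a naive independence assumption, the combined strength of several matching class features is the product of the individual likelihood ratios, each roughly the reciprocal of that feature's population frequency. A hypothetical sketch (the frequencies below are invented for illustration, not case data):

```python
def combined_lr(match_frequencies):
    """Combined likelihood ratio for a set of independent matching class
    features: a feature with population frequency f contributes a factor
    1/f, so even individually common features can combine to strong
    evidence."""
    lr = 1.0
    for f in match_frequencies:
        lr *= 1.0 / f
    return lr

# e.g. two tire-profile features and an axle-width class, each fairly common:
print(combined_lr([0.2, 0.1, 0.25]))  # -> 200.0
```

In practice feature dependence must be checked against the databases before multiplying, which is part of what the procedure above addresses.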
11:30 to 12:00 Karen Kafadar (University of Virginia)
Quantifying Information Content in Pattern Evidence

The first step in the ACE-V process for comparing fingerprints is the "Analysis" phase, where the latent print under investigation is subjectively assessed for its "suitability" (e.g., clarity and relevance of features and minutiae). Several proposals have been offered for objectively characterizing the "quality" of a latent print. The goal of such an objective assessment is to relate the "quality metric" (which may be a vector of quality scores) to the accuracy of the call (correct ID or correct exclusion), so that latent print examiners (LPEs) can decide immediately whether to proceed with the other steps of ACE-V. We describe some of these proposals that attempt to quantify the "information content" of a latent print or of its individual features ("minutiae"), and describe initial efforts aimed at assessing their association with accuracy, starting with NIST's public SD27a latent fingerprint database, which contains prints judged by "experts" as "good," "bad," or "ugly." One proposed metric, based on gradients to determine the clarity of the minutiae, correlates well with this general classification and thus can serve as an objective, rather than subjective, measure of information content. 
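A gradient-based clarity measure of the kind mentioned can be sketched as follows (an illustrative stand-in, not the specific metric evaluated in the talk): regions with sharp ridge structure have large local intensity gradients, while smudged or flat regions do not.

```python
import numpy as np

def clarity_score(patch):
    """Illustrative clarity metric for an image patch: the mean gradient
    magnitude. Flat (uninformative) patches score 0; patches with sharp
    edges or ridge structure score higher."""
    grad_rows, grad_cols = np.gradient(patch.astype(float))
    return float(np.hypot(grad_cols, grad_rows).mean())

flat = np.zeros((8, 8))           # no structure at all
edge = np.zeros((4, 8))
edge[:, 4:] = 1.0                 # a sharp vertical edge
print(clarity_score(flat), clarity_score(edge))
```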

12:00 to 12:30 David Balding (University of Melbourne); (University College London)
Inference for complex DNA profiles

I will outline some interesting aspects of the analysis of complex DNA profiles that have arisen from my recent work in court cases and in my research.  The latter has been largely performed in association with former PhD student Christ Steele at UCL and Cellmark UK.  We developed a new statistical model and software for the evaluation of complex DNA profiles [1], and some new approaches to validation.  We also investigated using a dropin model to account for multiple very-low-level contributors, the statistical efficiency of a split-and-replicate profiling strategy versus a one-shot profile, and a simple linkage adjustment.  If time and opportunity permit, I will use my last-speaker slot to comment on relevant issues raised during the conference.

  1. Steele C, Greenhalgh M, Balding D (2016). Evaluation of low-template DNA profiles using peak heights. Statistical Applications in Genetics and Molecular Biology, 15(5), pp. 431-445. doi:10.1515/sagmb-2016-0038




12:30 to 13:30 Lunch @ Wolfson Court