Spatial capture–recapture (SCR) methods have become widely applied in ecology. The immediate adoption of SCR reflects two developments: the methods resolve some major criticisms of traditional capture–recapture methods related to heterogeneity in detectability, and new technologies (e.g. camera traps, non‐invasive genetics) have vastly improved our ability to collect spatially explicit observation data on individuals. However, the utility of SCR methods reaches far beyond convenience and data availability. SCR presents a formal statistical framework that can be used to test explicit hypotheses about core elements of population and landscape ecology, and has profound implications for how we study animal populations. In this software note, we describe the technical basis and analytical workflow of oSCR, an R package for analyzing spatial encounter history data using a multi‐session sex‐structured likelihood. The impetus for developing oSCR was to create an accessible and transparent analysis tool that allows users to conveniently and intuitively formulate statistical models that map directly to fundamental processes of interest in spatial population ecology (e.g. space use, resource selection, density and connectivity). We have placed an emphasis on creating a transparent and accessible code base coupled with a logical workflow, which we hope will stimulate active participation in further technical developments.

Spatial capture–recapture (SCR) was originally developed as an extension of traditional (i.e. non‐spatial) mark–recapture to accommodate live trapping of animals (Efford ) but is now acknowledged as a profound advancement in spatial statistical modeling of animal populations (Borchers et al. 2016, Royle et al. 2018). SCR represents a unifying framework for investigating many aspects of spatial population ecology using data collected by the repeated encountering or capturing of individual animals (i.e. encounter histories) across space. With over a decade of development and application, ecologists have used SCR to examine a variety of topics including landscape and network connectivity, demography, resource selection and movement and dispersal (reviewed by Royle et al. 2018).

As with other hierarchical models in ecology (Royle and Dorazio ), the formal linkage between ecological processes and observational processes in an SCR model has allowed for better inferences on the former and more robust accommodation of the latter. In particular, SCR considers the collection of individuals in a population as a latent spatial point process of activity centers represented by coordinates distributed within some region of interest. A probability model for the activity centers can be used to estimate variation in density across space as it relates to environmental attributes or summarized to provide the density/abundance for a given area and time frame. Conditional on the latent point process, the probability of observing or encountering an individual at a sampling device is a function of the distance between the individual's activity center and the location of the sampling device. This distance function approximates the average scale of individual movement about an activity center and is often associated with home range features (Royle et al. 2014). By adding spatial context to the ecological and observational processes, SCR overcomes several deficiencies of traditional mark–recapture related to defining the sampling region and it expands the scope of potential inferences about both individual and population level attributes. Any survey methods that produce spatial encounters of individuals can provide the necessary data for fitting SCR models and exploring ecological hypotheses about the spatial structure of animal populations.

The rapid adoption of SCR has been supported by new technologies that ease the collection of spatial encounter information and by analytical resources that make model fitting for such data more accessible to a wider audience. Non‐invasive sampling techniques where individuals can be encountered and identities determined without physical capture allow researchers to efficiently sample large landscapes, particularly useful for wide‐ranging species (Long ). The most common techniques are camera trapping, in which remote cameras provide individual encounters for species with unique natural markings; and non‐invasive genetic sampling (NGS), where genetic material collected through active (e.g. hair snare) or passive (e.g. scat) methods yields DNA for identification of individuals for any species. The collection of marks, whether by photograph or genotype, can occur at fixed locations in space or by search–encounter approaches where observers use systematic spatial surveys and record unique locations of individual encounters. This flexibility in data collection enables the application of SCR methods to population estimation for a wide range of species.

Regardless of the technique used to collect spatial encounter histories, many variations of an SCR model can be fit in freely available software programs across a spectrum of user proficiency and potential for customization. Programs like DENSITY (Efford et al. ) and its R package descendant secr (Efford ) are designed to allow both novice and advanced users to fit a wide collection of SCR models using maximum likelihood estimation, though custom specifications are not possible. Intermediate users can write custom SCR likelihood specifications with the BUGS language (Lunn et al. 2009) to estimate parameters with a Bayesian approach by Markov chain Monte Carlo (MCMC), as first illustrated in Royle and Young (). A major advantage of the BUGS approach is the ease with which users can develop their own code (Kéry 2010) or adapt code already published (Royle et al. 2014). A disadvantage is that complex BUGS models can be notoriously slow, limiting the ability to easily explore multiple model structures or accommodate complex spatio–temporal dynamics. It is noteworthy, however, that these challenges have been addressed somewhat with the recent development of NIMBLE (de Valpine et al. 2017). Advanced users can freely construct custom code to address specific modeling problems, though high specificity can result in limited utility beyond the original purpose (and, thus, low adoption/application by other users). The tremendous benefit of open source software is that some initial efforts at generality can provide a framework upon which additional optional complexity can be built by interested collaborators, facilitated by community‐shared development platforms (e.g. GitHub). With this notion in mind, we present the oSCR package for fitting SCR models in R.

We started developing the oSCR project in 2014 to achieve two key objectives: First, we wanted a platform for fitting basic SCR models that accommodated non‐Euclidean distance models in the form of least‐cost path (Royle et al. 2013b, Sutherland et al. 2014) and handled full sex‐specificity using a full likelihood approach in which fully or partially observed sex data are included in the likelihood (Royle et al. 2015, Fuller et al. 2016). Second, we required a fully open and accessible implementation of the package so that a user moderately proficient in computing and statistics could understand, verify, and extend its capabilities and ensure timely development to meet rapidly developing analytical needs. Thus, oSCR is implemented completely in native R, meaning that anyone proficient in R can contribute to the project. From a conceptual standpoint, we aimed to provide a platform for using individual encounter history data to address the three core elements of spatial population ecology (Royle et al. 2018): 1) modeling spatial variation in density using inhomogeneous Poisson point process models (Borchers and Efford ); 2) modeling landscape connectivity using a model which allows for effective distance to be parameterized in terms of landscape structure, with the parameters of that ‘cost‐distance’ model estimated from observed data; and 3) integration of explicit resource selection models and telemetry data into spatial capture–recapture models (Royle et al. 2015, Linden et al. 2018). These concepts define the core functionality of oSCR. In this note, we describe the technical basis and analytical workflow of oSCR (Table 1, 2).

The modeling framework implemented in the oSCR package is that of a multi‐session sex‐structured spatial capture–recapture (MSSS‐SCR) model. The two key features of the construction of the MSSS‐SCR likelihood are (a) it accommodates sex structure in the form of a categorical individual covariate (Royle et al. 2015) and an additional parameter which is the population probability that an individual belongs to the baseline group (e.g. ‘female’), and (b) it accommodates discrete groups or sessions, where *N*_{g} is the population size of group *g*.

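SCR models assume that a population of *N* individuals each have an associated spatial location which represents the home range or activity center of the individual, represented by the geographic coordinates **s**_{i} = (*s*_{i1}, *s*_{i2}) for *i* = 1, …, *N*. The default model assumes that activity centers are distributed uniformly over a spatial region 𝒮, the state space:

$$\mathbf{s}_i \sim \mathrm{Uniform}(\mathcal{S}),$$

which is equivalent to a homogeneous point process parameterized by *N* or by the expected value *E*(*D*) = *N*/*A*, where *D* is density and *A* is the area of the state space. One important generalization of this model allows for the possibility that density depends on one or more spatially referenced covariates. For example, if *z*(**s**) is a covariate (e.g. habitat structure) at location **s**, then the inhomogeneous point process model posits that

$$D(\mathbf{s}) = \exp\big(\beta_0 + \beta_1 z(\mathbf{s})\big),$$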

where the parameter β_{1}, to be estimated, affects the relative density of points along a gradient of *z*(**s**).
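As a rough numerical illustration of the inhomogeneous point process, the following Python sketch evaluates a density surface on a handful of state‐space pixels and sums it to an expected abundance. It is not oSCR code: the log‐linear form of the density model, the function names and the covariate values are all assumptions made for illustration.

```python
import math

def pixel_density(z, beta0, beta1):
    """Assumed log-linear density model: D(s) = exp(beta0 + beta1 * z(s))."""
    return math.exp(beta0 + beta1 * z)

def expected_abundance(z_values, pixel_area, beta0, beta1):
    """Sum of D(s) * A(s) over state-space pixels of equal area."""
    return sum(pixel_density(z, beta0, beta1) * pixel_area for z in z_values)

# Hypothetical covariate values for five pixels, each of area 0.25
z_values = [-1.0, -0.5, 0.0, 0.5, 1.0]

# beta1 = 0 recovers the homogeneous model: every pixel has density exp(beta0)
flat = expected_abundance(z_values, 0.25, math.log(2.0), 0.0)

# beta1 > 0 shifts density toward pixels with larger z(s)
tilted = [pixel_density(z, math.log(2.0), 1.0) for z in z_values]
print(flat, tilted)
```

With β_{1} = 0 the expected abundance is simply density times total area, which is the homogeneous case described above.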

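The spatial point process describes the spatial structure of the population to be sampled. With the introduction of this point process, the capture–recapture framework can be extended directly to include a model for the probability of detection of an individual. The probability of detection at trap **x**_{j} for *j* = 1, …, *J* is modeled as a decreasing function of the distance between the trap and the individual's activity center **s**_{i}, for example using the half‐normal model:

$$p(\mathbf{x}_j, \mathbf{s}_i) = p_0 \exp\left(-\frac{\lVert \mathbf{x}_j - \mathbf{s}_i \rVert^2}{2\sigma^2}\right).$$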

This model introduces two key parameters in the SCR model: the baseline encounter probability *p*_{0}, which is the probability that an individual with activity center **s** is detected at location **x** = **s**, and the parameter σ, which controls the rate of decrease in detection probability as a function of the distance between **x** and **s**. Intuitively, σ relates to the extent of space used by an individual of the species under study. Many functional forms of the detection probability model are possible (Efford 2018), and they may involve different parameters that have slightly different meanings than *p*_{0} and σ of the half‐normal model, but these details are minor aspects of the SCR modeling framework. oSCR implements two detection functions – the half‐normal model and the two‐parameter hazard model where σ is, again, the spatial scale parameter but *p*_{0} is the detection rate, as well as variations of these that involve non‐Euclidean distance. Others could be easily added by the user if required.
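The roles of *p*_{0} and σ in the half‐normal model can be seen in a few lines of Python. This is a minimal sketch with hypothetical parameter values and a function name of our choosing, not the oSCR implementation.

```python
import math

def half_normal_p(trap, center, p0, sigma):
    """Half-normal encounter probability for one trap and one activity center."""
    d2 = (trap[0] - center[0]) ** 2 + (trap[1] - center[1]) ** 2
    return p0 * math.exp(-d2 / (2.0 * sigma ** 2))

# Hypothetical values: p0 = 0.3, sigma = 2 distance units
at_center = half_normal_p((0.0, 0.0), (0.0, 0.0), 0.3, 2.0)   # trap on the activity center
one_sigma = half_normal_p((2.0, 0.0), (0.0, 0.0), 0.3, 2.0)   # trap one sigma away
far_away = half_normal_p((10.0, 0.0), (0.0, 0.0), 0.3, 2.0)   # trap five sigma away
print(at_center, one_sigma, far_away)
```

Detection equals *p*_{0} when the trap sits on the activity center and decays toward zero as distance grows relative to σ.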

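Maximum likelihood estimation in SCR models is based on the marginal likelihood in which we remove the latent variable **s** from the conditional‐on‐**s** likelihood by averaging (or marginalizing) over the possible values of **s**. Therefore, we first identify the conditional‐on‐**s** likelihood. The model for encounter observation *y*_{ijk} for individual *i* at trap *j* on occasion *k* is Bernoulli:

$$y_{ijk} \mid \mathbf{s}_i \sim \mathrm{Bernoulli}\big(p(\mathbf{x}_j, \mathbf{s}_i; \theta)\big),$$

where we have indicated the dependence of encounter probability, *p*, on **s** and parameters θ = (*p*_{0}, σ) explicitly. The joint distribution of the data for individual *i* is the product of *J* × *K* such terms (i.e. contributions from each of *J* traps and *K* occasions):

$$[\mathbf{y}_i \mid \mathbf{s}_i, \theta] = \prod_{j=1}^{J} \prod_{k=1}^{K} \mathrm{Bernoulli}\big(y_{ijk};\, p(\mathbf{x}_j, \mathbf{s}_i; \theta)\big).$$

We note this assumes that encounter of individual *i* in each trap is independent of encounters in every other trap, conditional on **s**_{i}. This is the fundamental property of models implemented in oSCR. The marginal likelihood is computed by removing **s**_{i} by integration over the state space 𝒮:

$$[\mathbf{y}_i \mid \theta] = \int_{\mathcal{S}} [\mathbf{y}_i \mid \mathbf{s}_i, \theta]\,[\mathbf{s}_i]\, d\mathbf{s}_i.$$

In the default SCR model, we assume the activity centers **s** are uniformly distributed throughout the state space, which is to say Pr(**s** = **s**_{u}) = 1/*G*, where *u* = 1, 2, …, *G* index a grid of *G* points, **s**_{u}, and the area of grid cells is constant. In this case, the marginal pmf of **y**_{i} is

$$[\mathbf{y}_i \mid \theta] = \frac{1}{G} \sum_{u=1}^{G} [\mathbf{y}_i \mid \mathbf{s}_u, \theta].$$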
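The marginalization over a uniform grid of candidate activity centers can be sketched as follows. This Python example assumes the half‐normal Bernoulli model described above and a tiny hypothetical design (two traps, two occasions, nine grid points); the function names are ours and it is not the oSCR code.

```python
import itertools
import math

def half_normal_p(trap, center, p0, sigma):
    d2 = (trap[0] - center[0]) ** 2 + (trap[1] - center[1]) ** 2
    return p0 * math.exp(-d2 / (2.0 * sigma ** 2))

def conditional_lik(y, traps, center, p0, sigma):
    """[y_i | s, theta]: product of Bernoulli terms over traps j and occasions k."""
    lik = 1.0
    for j, trap in enumerate(traps):
        p = half_normal_p(trap, center, p0, sigma)
        for y_jk in y[j]:
            lik *= p if y_jk == 1 else (1.0 - p)
    return lik

def marginal_lik(y, traps, grid, p0, sigma):
    """[y_i | theta]: average of the conditional likelihood over G grid points."""
    return sum(conditional_lik(y, traps, s, p0, sigma) for s in grid) / len(grid)

# Hypothetical design: 2 traps, 2 occasions, 3 x 3 grid of candidate centers
traps = [(0.0, 0.0), (1.0, 0.0)]
grid = [(gx, gy) for gx in (-1.0, 0.5, 2.0) for gy in (-1.0, 0.0, 1.0)]
y_obs = [[1, 0], [0, 0]]  # detected once, at the first trap on the first occasion
observed = marginal_lik(y_obs, traps, grid, p0=0.3, sigma=1.0)

# Sanity check: the marginal pmf sums to 1 over all 2^(J*K) possible histories
total = sum(
    marginal_lik([list(bits[:2]), list(bits[2:])], traps, grid, 0.3, 1.0)
    for bits in itertools.product((0, 1), repeat=4)
)
print(observed, total)
```

Because each conditional likelihood is a proper pmf, their grid average is one too, which is what the final check confirms.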

The joint likelihood for the data from *n* observed individuals, assuming independence of encounters among and between individuals, is the product of *n* such terms and, in addition, a contribution of the *n*_{0} = *N* − *n* uncaptured individuals. Each of these all‐zero encounter histories has the same marginal pmf contribution in the likelihood given above, and we denote that marginal probability by π_{0}. The only question is, how many? We therefore include the number of such all‐zero encounter histories (that is, the number of individuals not encountered) as an unknown parameter of the model. With *n*_{0} unknown, we have to be sure to include a combinatorial term to account for the fact that, of the *N* individuals in the population, there are *N*!/(*n*! *n*_{0}!) ways to obtain a sample of *n* observed individuals. Therefore, the joint likelihood has this form:

$$L(\theta, n_0 \mid \mathbf{y}) = \frac{N!}{n!\, n_0!} \left\{ \prod_{i=1}^{n} [\mathbf{y}_i \mid \theta] \right\} \pi_0^{\,n_0}, \qquad N = n + n_0,$$

where [**y**_{i}|θ] is the marginal probability of the encounter history of individual *i*.

This is discussed in Borchers and Efford () as the conditional‐on‐*N* form of the likelihood, and referred to by Royle et al. (2013a) as the ‘binomial‐form’ due to its origin as arising from an assumption that *n* is a binomial outcome from a population of size *N* (and also it is ‘binomial looking’).

The binomial‐form of the likelihood just described can be analyzed directly in oSCR (option DorN="N") and, in this case, log(*n*_{0}) is estimated along with parameters *p*_{0} and σ and other covariates of the models as described in Table 2. Important variations of this model accommodated in oSCR modify the binomial observation model to allow for multiple detections during an occasion – a Poisson observation model (encmod="P"), and a multinomial or multi‐catch observation model (encmod="M"). For these observation models, there are no additional considerations in building the likelihood, only in replacing the Bernoulli observation model with an appropriate alternative.

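For likelihood analysis it is convenient to remove *N* from the likelihood by putting a Poisson prior distribution on it. That is, we assume *N* ~ Poisson(*DA*), where *D* is the density of activity centers and *A* is the area of the state space. Under this prior, we can remove *N* from the binomial‐form of the likelihood, again by summation over the possible values of *N*. We call this the Poisson‐integrated likelihood which is implemented in oSCR (as the default with option DorN="D") and also is the default likelihood form in secr (Efford 2018). To compute the Poisson‐integrated likelihood we do a further level of marginalization over the Poisson prior distribution:

$$L(\theta, D \mid \mathbf{y}) = \sum_{N=n}^{\infty} \frac{N!}{n!\,(N-n)!} \left\{ \prod_{i=1}^{n} [\mathbf{y}_i \mid \theta] \right\} \pi_0^{\,N-n}\, \frac{e^{-\Lambda} \Lambda^{N}}{N!} = \frac{\Lambda^{n}}{n!} \left\{ \prod_{i=1}^{n} [\mathbf{y}_i \mid \theta] \right\} \exp\{-\Lambda (1 - \pi_0)\},$$

where Λ = *DA* is the expected population size in the state space, [**y**_{i}|θ] is the marginal probability of the encounter history of individual *i*, and π_{0} is the marginal probability of an all‐zero encounter history.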

We emphasize there are two marginalizations involved in the formation of this likelihood: 1) the integration to remove the latent variables **s**; and, 2) summation to remove the parameter *N*.
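The second marginalization can be checked numerically. The Python sketch below assumes the standard binomial‐form likelihood with combinatorial term *N*!/(*n*! *n*_{0}!) and a Poisson prior on *N* with mean Λ, and compares a brute‐force sum over *N* with the closed form Λ^{n}/*n*! × ∏[**y**_{i}|θ] × exp{−Λ(1 − π_{0})}. The marginal probabilities and parameter values are hypothetical, and the function names are ours rather than oSCR's.

```python
import math

def binomial_form_loglik(marg, pi0, N):
    """log of N!/(n! n0!) * prod_i [y_i|theta] * pi0^n0, with n0 = N - n."""
    n = len(marg)
    n0 = N - n
    return (math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(n0 + 1)
            + sum(math.log(m) for m in marg) + n0 * math.log(pi0))

def poisson_integrated_loglik(marg, pi0, lam):
    """log of lam^n / n! * prod_i [y_i|theta] * exp(-lam * (1 - pi0))."""
    n = len(marg)
    return (n * math.log(lam) - math.lgamma(n + 1)
            + sum(math.log(m) for m in marg) - lam * (1.0 - pi0))

def poisson_logpmf(N, lam):
    return N * math.log(lam) - lam - math.lgamma(N + 1)

# Hypothetical marginal probabilities for n = 3 observed individuals
marg = [0.05, 0.02, 0.10]
pi0, lam = 0.6, 10.0

# Brute-force sum over N; the tail beyond N = 200 is negligible for these values
brute = math.log(sum(
    math.exp(binomial_form_loglik(marg, pi0, N) + poisson_logpmf(N, lam))
    for N in range(len(marg), 201)
))
closed = poisson_integrated_loglik(marg, pi0, lam)
print(brute, closed)
```

The two values agree to numerical precision, which illustrates why *N* never needs to be enumerated when the Poisson‐integrated form is used.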

A more general expression for the Poisson‐integrated likelihood recognizes that the Poisson assumption for *N* implies that, on a discrete state space, the number of activity centers in a state‐space pixel, *N*(**s**), is also Poisson with mean *D*(**s**)*A*(**s**) where *A*(**s**) is the area of the state‐space pixel having center coordinate **s** and *D*(**s**) is the pixel density. This is the property of compound additivity which defines a Poisson point process and allows for specifying explicit models for the distribution of individuals across homogeneous or heterogeneous landscapes. In particular, if we suppose that *D*(**s**) depends on measurable covariates at the scale of the state‐space resolution (i.e. pixel size), e.g.

$$\log D(\mathbf{s}) = \beta_0 + \beta_1 z(\mathbf{s}),$$

then the conditional intensity function for the *n* observed activity centers can be expressed, over the *G* state‐space pixels, as:

$$[\mathbf{s}_u] = \frac{D(\mathbf{s}_u) A(\mathbf{s}_u)}{\sum_{v=1}^{G} D(\mathbf{s}_v) A(\mathbf{s}_v)}.$$

Then, the marginalization operation of each observed encounter history is

$$[\mathbf{y}_i \mid \theta] = \sum_{u=1}^{G} [\mathbf{y}_i \mid \mathbf{s}_u, \theta]\,[\mathbf{s}_u]$$

and

$$\Lambda = \sum_{u=1}^{G} D(\mathbf{s}_u) A(\mathbf{s}_u)$$

is the expected population size in the state space.

These considerations of formulating the binomial‐form of the likelihood and then translating it to the Poisson‐integrated form define a general class of SCR models which can accommodate Bernoulli, Poisson or multinomial encounters and allow for the modeling of covariates on the various elemental parameters *p*_{0}, σ and *D*(**s**). The scope of these models is described in Table 2. We now define two additional important structural elements that define the core likelihood structure implemented in oSCR.

In most studies we record some information about the sex of individuals which might be used in developing sex‐structured models in which one or more parameters depend on sex. Moreover, it is common in practice to have some missing values of sex due to DNA samples not amplifying, poor photo quality, or other considerations. To build models with sex specificity of parameters, we require a different form of the likelihood which accommodates observed sex data; such data are informative about the sex ratio of the population and, hence, relative densities of males and females (Royle et al. 2015). This new form of the likelihood is used by default if sex information is included in the core oSCR data object, the scrFrame, and thus, all models fit using the same data object have comparable likelihoods, and AIC‐based model comparisons can be used. The form of sex‐structured model described here is implemented in the oSCR package.

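In general, let there be *n*_{1} captured individuals with sex data, *n*_{2} captured individuals with missing values, and *n*_{0} = *N* − *n*_{1} − *n*_{2} undetected individuals. Let *c*_{i} indicate the sex of individual *i*, and let ψ denote the population probability that an individual belongs to the baseline group (e.g. ‘female’). For an individual with missing sex, the contribution to the likelihood is obtained by averaging over the possible values of the sex variable:

$$[\mathbf{y}_i \mid \theta] = \sum_{c} [\mathbf{y}_i \mid c, \theta]\, \Pr(c),$$

with Pr(*c*) = ψ for the baseline group and 1 − ψ otherwise. There are *n*_{2} such contributions to the likelihood. In addition, each of the individuals with observed sex contributes a component of this form:

$$[\mathbf{y}_i, c_i \mid \theta] = [\mathbf{y}_i \mid c_i, \theta]\, \Pr(c_i).$$

The binomial‐form of the likelihood is then

$$L = \frac{N!}{n!\, n_0!} \left\{ \prod_{i=1}^{n_1} [\mathbf{y}_i \mid c_i, \theta]\, \Pr(c_i) \right\} \left\{ \prod_{i=n_1+1}^{n} [\mathbf{y}_i \mid \theta] \right\} \pi_0^{\,n_0},$$

where *n* = *n*_{1} + *n*_{2} and π_{0} is the marginal probability of the ‘all‐zero’ encounter history, averaged over all possible groups. Remember, for the SCR model, the elemental part of the likelihood is the marginal likelihood of an individual's encounter history, averaged over possible values of **s**. For encounter histories which have missing sex data, we then have to do an additional marginalization over possible values of the sex variable producing the [**y**_{i}|θ] term above.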
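The difference between the contributions of individuals with observed and missing sex can be sketched in Python. The symbol psi (the probability of belonging to the baseline group), the dictionary layout and the numerical values are assumptions for illustration only; this is not how oSCR stores or computes these quantities.

```python
def likelihood_contribution(lik_by_sex, psi, sex=None):
    """Contribution of one captured individual to the sex-structured likelihood.

    lik_by_sex: marginal probability of the encounter history under each sex,
                e.g. {"F": [y|F, theta], "M": [y|M, theta]} (hypothetical values)
    psi:        assumed probability of belonging to the baseline group ("F")
    sex:        "F", "M", or None when sex was not recorded
    """
    prior = {"F": psi, "M": 1.0 - psi}
    if sex is None:
        # Missing sex: average over both possible values of the sex variable
        return sum(lik_by_sex[c] * prior[c] for c in prior)
    # Observed sex: joint probability of the history and the observed sex
    return lik_by_sex[sex] * prior[sex]

lik_by_sex = {"F": 0.04, "M": 0.01}
known_f = likelihood_contribution(lik_by_sex, psi=0.6, sex="F")
known_m = likelihood_contribution(lik_by_sex, psi=0.6, sex="M")
unknown = likelihood_contribution(lik_by_sex, psi=0.6)
print(known_f, known_m, unknown)
```

The missing‐sex contribution is the sum of the two observed‐sex contributions, so individuals of unknown sex still inform the detection parameters while the individuals of known sex inform the sex ratio.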

A common situation in many studies is to collect data on distinct more‐or‐less independent populations. For example, a camera trapping study might take place in multiple reserves or use multiple camera trap grids spaced far apart. Or sampling for amphibians using artificial cover objects (ACOs) involves replicated ACO arrays in different habitats (Sutherland et al. 2016, Schmidt et al. 2017). Replication might occur over time as well – sampling in different seasons or years. It is important to be able to combine the data from such replicated studies into a single statistical model in order to increase statistical power, and also to test effects which might represent important contrasts among the sampled populations. In practice, these populations are called ‘sessions’, ‘groups’ or ‘strata’, and the model which integrates the data from different sessions is referred to as a multi‐session model. The Poisson‐integrated likelihood previously introduced applies directly to this multi‐session structure. Let *N*_{g} be the population size for group *g*, and assume

$$N_g \sim \mathrm{Poisson}(\lambda_g),$$

where the *N*_{g} are mutually independent random variables. The basic likelihood described above can then be computed by independently marginalizing over this prior distribution for each group, so that the joint likelihood is the product of the group‐specific likelihoods:

$$L(\theta, \lambda_1, \lambda_2, \ldots \mid \mathbf{y}) = \prod_{g} L_g(\theta, \lambda_g \mid \mathbf{y}_g),$$

and then the data from all groups are simultaneously used to estimate parameters of the SCR detection model and the information among groups is used to estimate the parameters describing variation in λ_{g}.

A key assumption of the multi‐session model is independence of the observed encounter histories among groups and also independence of the population size random variables *N*_{g}. One implication is that individuals cannot appear in multiple groups, which might happen if the trapping arrays are close together or if the groups represent the same population sampled over time. Ignoring this non‐independence should not bias parameter estimates, but will likely lead to an overstatement of precision. Simulations would be required to understand the magnitude by which precision is overstated. In addition, the independence assumption is probably invalid if some of the group structure represents sub‐groups within populations, for example male and female sub‐populations of multiple groups, age classes, or species sub‐groups. In such cases, we might expect the population sizes of these sub‐groups to covary.

Using a least cost path approach, SCR allows the estimation of one or more resistance parameters, δ, that quantify how movement is influenced by local landscape structure. The model replaces Euclidean distance with ‘ecological’ distance (Royle et al. 2013b) and therefore relaxes the restrictive assumptions of symmetrical and stationary space use regardless of the surrounding habitat (Sutherland et al. 2014, 2018, Fuller et al. 2016). Estimating parameters of the cost function requires specification of a fourth model in the oSCR model call, and a second state–space‐like spatially referenced data frame, a costDF, that has the coordinates of the pixel centroids and associated pixel specific covariates (Table 2, code supplement in Sutherland et al. 2018).
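To illustrate what replacing Euclidean distance with a least‐cost path distance means, the Python sketch below runs Dijkstra's algorithm on a small raster. It assumes a log‐linear resistance surface exp(δ *z*) and a step cost equal to the mean resistance of the two neighbouring pixels times the step length, which is one common convention; the function name, the 8‐neighbour rule and the covariate grids are illustrative assumptions, not the oSCR implementation.

```python
import heapq
import math

def least_cost_distance(z, start, goal, delta, pixel_size=1.0):
    """Least-cost path distance between two pixels of a covariate raster z."""
    nrow, ncol = len(z), len(z[0])
    resist = [[math.exp(delta * v) for v in row] for row in z]
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return d
        if d > best.get((r, c), math.inf):
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr == 0 and dc == 0) or not (0 <= rr < nrow and 0 <= cc < ncol):
                    continue
                step = pixel_size * math.hypot(dr, dc)
                cost = d + step * 0.5 * (resist[r][c] + resist[rr][cc])
                if cost < best.get((rr, cc), math.inf):
                    best[(rr, cc)] = cost
                    heapq.heappush(heap, (cost, (rr, cc)))
    return math.inf

# Hypothetical rasters: a flat landscape and one with a high-resistance pixel
flat = [[0.0] * 4 for _ in range(3)]
barrier = [[0.0, 5.0, 0.0]]

d_flat = least_cost_distance(flat, (0, 0), (0, 3), delta=1.0)
d_diag = least_cost_distance(flat, (0, 0), (2, 2), delta=1.0)
d_barrier = least_cost_distance(barrier, (0, 0), (0, 2), delta=1.0)
d_no_effect = least_cost_distance(barrier, (0, 0), (0, 2), delta=0.0)
print(d_flat, d_diag, d_barrier, d_no_effect)
```

When δ = 0 or the covariate is constant, the least‐cost distance reduces to the grid‐path distance; a positive δ makes crossing the high‐valued pixel expensive, which is how landscape structure enters the encounter model.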

The integration of telemetry data can allow for improved parameter estimation by informing the σ parameter and/or supporting a resource selection function (RSF) as part of the encounter model (Royle et al. 2013a, Tenan et al. 2017, Linden et al. 2018). Under the assumption that individual encounters at traps are a ‘thinned’ version of telemetry fixes (and that telemetered individuals are representative of the population targeted by trapping), certain parameters can be shared in the likelihoods for each data source. The original model formulation by Royle et al. (2013a) involved independent data, such that telemetered individuals did not appear in the spatial encounter histories generated by the trap sampling. Linden et al. (2018) described the marginal likelihood for situations where telemetered individuals were also encountered at traps, allowing each data source to be informative about the latent activity centers for relevant individuals. oSCR allows both independent and dependent telemetry data and various options for the degree of telemetry integration. In all cases, fixes need to be summarized as pixel counts within the state space (with functions provided to achieve this) and RSF integration requires an additional data frame, rsfDF, similar to the costDF, with a resolution matching the state space (Table 2, code available in Linden et al. 2018). We note that, currently, oSCR cannot model serial correlation in telemetry fixes and assumes that fix locations are independent across time.
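Summarizing fixes as pixel counts can be pictured with a short Python sketch that assigns each fix to its nearest state‐space pixel centroid. oSCR provides its own functions for this step; the function name, the nearest‐centroid rule and the coordinates below are hypothetical and serve only to show the shape of the result.

```python
def fixes_to_pixel_counts(fixes, centroids):
    """Count telemetry fixes by nearest state-space pixel centroid."""
    counts = [0] * len(centroids)
    for fx, fy in fixes:
        nearest = min(
            range(len(centroids)),
            key=lambda u: (centroids[u][0] - fx) ** 2 + (centroids[u][1] - fy) ** 2,
        )
        counts[nearest] += 1
    return counts

# Hypothetical 2 x 2 state space with unit pixels and five fixes for one animal
centroids = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
fixes = [(0.2, 0.4), (0.6, 0.7), (1.4, 0.3), (1.9, 1.8), (0.4, 0.1)]
counts = fixes_to_pixel_counts(fixes, centroids)
print(counts)
```

The result is one count per pixel for each telemetered individual, and the counts sum to the number of fixes.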

Here we outline a typical oSCR analysis and pair the workflow with an example using red‐backed salamander (RBS) data (Sutherland et al. 2016). A complete annotated workflow is provided in the Supplementary vignette Appendix 1, and here we highlight the data objects and functions required to conduct an SCR analysis in oSCR (Table 1).

Every spatial capture–recapture study gives rise to at least two data structures around which we have built the data processing step of the oSCR workflow. The first data object is a ‘Trap Deployment File’ (TDF) containing, at minimum, the name and coordinates of each trap or detector. In addition, the TDF can contain trap‐by‐occasion binary operation data (1 = operational, 0 = not operational), and, if available, trap‐specific covariates that may or may not vary by occasion. If all detectors are operational in all occasions, there is no need to supply trap operation data. Trap characteristics (name, location and operation data) are separated from covariates by a column of ‘\’ characters. Because the number of occasions can vary between sessions, oSCR requires a separate TDF for each session.

The second data source is the ‘Encounter Data File’ (EDF) which contains the individual encounter history data. This has, at a minimum, columns containing the unique individual identifier, the name of the trap in which it was detected, the occasion during which the encounter occurred, and a numeric session identifier. If the sex of the species can be determined, this information can also be included as an additional column which may include missing values. Unlike the TDF, and because there is a session identifier, encounter data from all sessions are included in a single EDF. The names of the traps in the EDF must match the names in the associated TDFs. The sex column should contain only 'M' for males, 'F' for females, and an unknown identifier (we use 'U') for unknowns.
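To make the link between an EDF and an individual‐by‐trap‐by‐occasion encounter history array concrete, the Python sketch below builds such an array for a single session. In oSCR this formatting is handled for the user; the column names, trap names and records here are hypothetical.

```python
def build_caphist(edf_rows, trap_names, n_occasions):
    """Return (ids, y) where y[i][j][k] counts encounters of individual i
    at trap j on occasion k, for a single session."""
    ids = sorted({row["id"] for row in edf_rows})
    id_index = {ind: i for i, ind in enumerate(ids)}
    trap_index = {name: j for j, name in enumerate(trap_names)}
    y = [[[0] * n_occasions for _ in trap_names] for _ in ids]
    for row in edf_rows:
        if row["trap"] not in trap_index:
            # Trap names in the EDF must match those in the TDF
            raise ValueError(f"trap {row['trap']!r} is not in the TDF")
        i, j = id_index[row["id"]], trap_index[row["trap"]]
        y[i][j][row["occasion"] - 1] += 1  # occasions are numbered from 1
    return ids, y

# Hypothetical EDF rows for one session (column names are illustrative)
edf = [
    {"id": "A", "trap": "t1", "occasion": 1, "sex": "F"},
    {"id": "A", "trap": "t2", "occasion": 3, "sex": "F"},
    {"id": "B", "trap": "t2", "occasion": 2, "sex": "U"},
]
ids, y = build_caphist(edf, trap_names=["t1", "t2", "t3"], n_occasions=3)
print(ids, y)
```

The check on trap names mirrors the requirement that EDF trap names match those in the associated TDF.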

Once these objects have been read into R, the oSCR workflow begins with the creation of an scrFrame, a data object that links the EDF and TDFs, checks for errors, and formats the data for analysis. The scrFrame is created using the helper function data2oscr() and contains the following data objects:

caphist – session‐specific list of individual‐by‐trap‐by‐occasion encounter history arrays. Arrays can be binary or contain capture frequencies

traps – session‐specific list of trap coordinates

trapCovs – session‐specific list of trap‐by‐covariate data frames containing trap specific covariate values

trapOperation – session‐specific trap‐by‐occasion binary matrix denoting whether traps were operational

indCovs – session‐specific list of individual covariates. These data frames contain the sex of the individual if sex information is provided

sigCovs – session‐by‐covariate data frame of covariates

Typing the scrFrame object name produces summaries of the trapping effort and detections by session (number of individuals: n individuals, traps: n traps, and occasions: n occasions), and of the capture histories (average number of times an individual is encountered: avg caps, average number of spatial locations individuals are encountered at: avg spatial caps, the mean maximum distance moved: mmdm, and the mmdm pooled across sessions). The spatial encounter history data can also be summarized graphically using plot(scrFrame) (Fig. 1).
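The mean maximum distance moved can be computed from the trap locations at which each individual was encountered. The Python sketch below assumes that individuals encountered at a single location are excluded from the mean; conventions differ on this point and we do not claim this matches the oSCR summary exactly. The function names and coordinates are hypothetical.

```python
import itertools
import math

def max_distance_moved(locations):
    """Largest pairwise distance among the trap locations of one individual."""
    return max(
        (math.hypot(a[0] - b[0], a[1] - b[1])
         for a, b in itertools.combinations(locations, 2)),
        default=0.0,
    )

def mmdm(capture_locations):
    """Mean of the maximum distance moved, over individuals caught at 2+ sites."""
    moved = [
        max_distance_moved(sorted(set(locs)))
        for locs in capture_locations.values()
        if len(set(locs)) > 1
    ]
    return sum(moved) / len(moved) if moved else float("nan")

# Hypothetical trap coordinates of encounters for three individuals
captures = {
    "A": [(0.0, 0.0), (3.0, 4.0)],
    "B": [(1.0, 1.0), (1.0, 2.0), (1.0, 1.0)],
    "C": [(5.0, 5.0)],  # one location only, excluded under the stated assumption
}
print(mmdm(captures))
```

Summaries of this kind are useful as a rough check on the spatial scale of movement before any model is fit.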

A key, but often under‐appreciated, part of an SCR analysis is the definition of the state space, which oSCR represents as a state‐space data frame (the ssDF).

The RBS study system is a set of four independent artificial cover board arrays set in a small woodland in Ithaca, New York. Each array consists of 50 small (25 × 25 cm) cover boards (detectors) arranged in a 5 × 10 m rectangle with each board separated by 1 m (Fig. 1, 2, Snippet 1). Because each array is independent we define each as an independent ‘session’ (session 1–4). Arrays were visited 7, 5, 6 and 4 times, respectively, from September to November in 2014 (Snippet 1). During each visit (or occasion), every board was checked, and unmarked salamanders were given unique individual marks by injecting visual implant elastomer (Grant ). Individual spatial encounter histories were generated from the collection of the location, occasion, and identity of every individual detected. See Sutherland et al. (2016) for a full description.

With the scrFrame and the ssDF created, we can move to model fitting (step 2, Table 1). The main fitting function in oSCR is oSCR.fit(). The fitting function requires, at minimum, an scrFrame, an ssDF, and a model list that specifies the model structure in standard R model formula syntax for each of the three model components (Table 1). The list contains, in this order, a model formula for density (D~), baseline detection (p~), and the spatial scale of detection (sig~). An example of three RBS models is shown in Snippet 2. The base model has constant density and space use, and, based on what we know about RBS activity patterns, a detection model with a quadratic effect of day of year and a behavioural response. The full model has session‐specific density, session‐ and sex‐specific space use, and, in addition to a quadratic effect and behavioural response, detection varying by session and sex. For the remainder of this example, we use the model with the lowest AIC as the top model (Supplementary vignette Appendix 1).

The function oSCR.fit() has many other arguments that can be used to control the nature of the model fitting routine including, but not limited to: a choice of binomial (encmod="B"), Poisson (encmod="P") or multinomial (encmod="M") encounter models (Royle et al. 2014); asymmetric space use models (Royle et al. 2013b, Sutherland et al. 2014); resource selection functions (Royle et al. 2013a); integration of telemetry data (Tenan et al. 2017, Linden et al. 2018); and ‘local‐to‐capture’ likelihood evaluation (Milleret et al. 2018).

It is common in observational studies in ecology to develop and compare multiple competing models, typically using AIC. This is possible in oSCR in what we call the ‘model selection’ stage of the workflow. In order to formally compare models, a list of fitted models must be created using the function fitList.oSCR(). Ranked AIC model tables can be produced by passing the fitList.oSCR object to the modSel.oSCR() function. The model selection function returns three objects of interest: a model table ranked by AIC (lowest first), a model × parameter matrix of maximum likelihood estimates, and a model × parameter matrix of standard errors. While the model table is perhaps of most interest, the coefficient and standard error matrices are useful for model averaging. Although there is currently no helper function for generating model‐averaged predictions, model‐averaged coefficients can be computed using ma.coef().
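The arithmetic behind a ranked AIC table is simple enough to sketch in a few lines of Python. The model names, negative log‐likelihoods and parameter counts below are made up, and the function is our own illustration rather than a reimplementation of modSel.oSCR().

```python
import math

def aic_table(fits):
    """Rank models by AIC.

    fits: {model name: (negative log-likelihood, number of parameters)}
    Returns rows of (name, AIC, delta AIC, AIC weight), lowest AIC first.
    """
    aic = {name: 2.0 * nll + 2.0 * k for name, (nll, k) in fits.items()}
    best = min(aic.values())
    rel = {name: math.exp(-0.5 * (a - best)) for name, a in aic.items()}
    total = sum(rel.values())
    rows = [(name, aic[name], aic[name] - best, rel[name] / total) for name in aic]
    return sorted(rows, key=lambda row: row[1])

# Hypothetical fits: (negative log-likelihood, parameter count)
fits = {"base": (512.4, 6), "sex": (505.1, 8), "full": (503.9, 14)}
table = aic_table(fits)
for name, aic, delta, weight in table:
    print(f"{name:5s} AIC={aic:8.1f} dAIC={delta:5.1f} w={weight:.3f}")
```

In this made‐up example the most complex model has the highest likelihood but is penalized for its extra parameters, so the intermediate model ranks first.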

Finally, making predictions from specific models can be done using get.real(), a function that is similar to the predict() functions used on unmarked or glm model objects. The function takes the model object, an argument identifying which model component to predict from, e.g. density (type="dens"), detection (type="det"), or sigma (type="sig"), and a newdata object that is a data frame used for making predictions. The newdata object should contain columns for all covariates in the model component, including a sex column if sex is included in the scrFrame. Figure 3 shows the predicted relationships for detection (top), density (middle row), and space use (bottom row) based on the ‘top’ RBS model (Supplementary vignette Appendix 1).

In this software note, we have outlined the main features and functionality of the R package oSCR. We emphasize the accessible and intuitive nature of the code base and workflow and hope this stimulates active participation in further technical developments. We do, however, note that oSCR represents one of several options for analyzing various classes of spatial capture–recapture models, each offering unique capabilities. Examples include secr, a comprehensive general purpose package for fitting SCR models (Efford ); ascr, a package for analyzing spatial encounters generated from acoustic data (Stevenson 2018); openpopscr, a package for fitting open population SCR models by maximum likelihood (Glennie et al. ); and its Bayesian analogue OpenPopSCR (Augustine ). We have attempted, as much as possible, to maintain consistency in data formatting and terminology with the suite of available software options to avoid confusion, and importantly, to ensure complementarity.

oSCR therefore represents an accessible and intuitive framework, and accompanying workflow, for analyzing spatial encounter history data that places a specific focus on the inherent structure that exists both within populations (e.g. sex, or any other binary class) and between populations that have some level of spatial or temporal independence (e.g. session). Importantly, oSCR is set up naturally to quantify the importance of such structure while simultaneously estimating density, detection and space use.

oSCR is free and open source, available on GitHub (<

Sutherland, C., Royle, J. A. and Linden, D. W. . ‘oSCR’. – Ecography XX: XXX–XXX (ver. X.X).

*Acknowledgments* – This work was supported by the New York State Department of Environmental Conservation and the Hudson River Natural Resource Trustees. The conclusions and opinions presented here are those of the authors and do not represent the official position of the New York State Department of Environmental Conservation, or the Hudson River Natural Resource Trustees. We also thank Dana Morin and Olivier Gimenez for very thoughtful reviews of the manuscript. Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

Supplementary material (available online as Appendix ecog‐04551 at <