R
package
bangarang
The bangarang
package is a bundle of datasets and
functions that help analysts with research in the Kitimat Fjord System
(KFS) in the north coast of mainland British Columbia. Nearly all of the
content is catered specifically to analysis of datasets from the RV
Bangarang expedition of 2013 - 2015, which was focused on the
abundance, distribution, and foraging ecology of whales, seabirds,
salmon, and their prey during the months of summer and early fall. The
research involved line-transect sampling, active acoustic (echosounder)
surveys, and oceanographic sampling (both while underway and at a grid
of stations).
Methodological details can be found here. More info on the Bangarang project can be found here. The Bangarang project was carried out as a doctoral thesis at Scripps Institution of Oceanography in close collaboration with the Gitga’at First Nation, BC Whales, Fisheries & Oceans Canada, and the NOAA Southwest Fisheries Science Center.
The bangarang
package can be downloaded directly from
GitHub
:
# Install devtools if needed
if (!require('devtools')) install.packages('devtools')
# Increase timeout for download, since there are datasets
options(timeout=9999999)
# Install package
devtools::install_github('ericmkeen/bangarang')
Load into your R
session:
library(bangarang)
## Warning: replacing previous import 'data.table::first' by 'dplyr::first' when
## loading 'bangarang'
## Warning: replacing previous import 'data.table::last' by 'dplyr::last' when
## loading 'bangarang'
## Warning: replacing previous import 'data.table::between' by 'dplyr::between'
## when loading 'bangarang'
## Warning: replacing previous import 'scales::rescale' by
## 'spatstat.geom::rescale' when loading 'bangarang'
## Warning: replacing previous import 'data.table::shift' by
## 'spatstat.geom::shift' when loading 'bangarang'
## Warning: replacing previous import 'scales::viridis_pal' by
## 'viridis::viridis_pal' when loading 'bangarang'
This vignette was made with bangarang
version 3.16, and
will make use of a few other packages:
library(dplyr)
library(tidyr)
library(ggplot2)
Produce a map of the Bangarang study area in the KFS using
the ggplot
and sf
packages:
gg_kfs()
As with all the bangarang
functions, see this function’s
documentation for changing the geographic range, color, and transparency
settings.
?gg_kfs
data(kfs_land)
kfs_land %>% class
## [1] "SpatialPolygons"
## attr(,"package")
## [1] "sp"
kfs_land %>% glimpse
## Formal class 'SpatialPolygons' [package "sp"] with 4 slots
## ..@ polygons :List of 1
## .. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
## ..@ plotOrder : int 1
## ..@ bbox : num [1:2, 1:2] -129.9 52.7 -127.5 54.1
## .. ..- attr(*, "dimnames")=List of 2
## ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
## ..$ comment: chr "TRUE"
Snapshot of dataset:
par(mar=c(.1,.1,.1,.1))
kfs_land %>% plot
data(kfs_seafloor)
kfs_seafloor %>% glimpse
## Rows: 375,178
## Columns: 3
## $ x <dbl> -129.6300, -129.6292, -129.6283, -129.6275, -129.6267, -129.6258…
## $ y <dbl> 53.55, 53.55, 53.55, 53.55, 53.55, 53.55, 53.55, 53.55, 53.55, 5…
## $ layer <dbl> -0.4261364, -70.2601395, -67.4961166, -85.6188736, -99.3454132, …
Plot it:
ggplot(kfs_seafloor,
aes(x=x, y=y, color=layer)) +
geom_point(size=.1) +
xlab(NULL) + ylab(NULL) + labs(color = 'Depth (m)') +
theme_minimal()
data(shiplane)
shiplane %>% glimpse
## Rows: 137
## Columns: 4
## $ PID <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ POS <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,…
## $ X <dbl> -128.6883, -128.6832, -128.6903, -128.6990, -128.7111, -128.7373, …
## $ Y <dbl> 53.99415, 53.97649, 53.95486, 53.93647, 53.91967, 53.89999, 53.879…
Plot it:
gg_kfs() +
geom_path(data=shiplane,
mapping=aes(x=X, y=Y, group=PID)) +
xlab(NULL) + ylab(NULL)
data(provinces)
provinces %>% glimpse
## Rows: 4,458
## Columns: 5
## $ PID <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ POS <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18…
## $ X <dbl> -129.2988, -129.1944, -129.1999, -129.1978, -129.1965, -129.2…
## $ Y <dbl> 52.97428, 52.97387, 52.98048, 52.98586, 52.99082, 52.99371, 5…
## $ province <int> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
Check it out:
gg_kfs() +
geom_polygon(data=provinces,
mapping=aes(x=X,
y=Y,
group = factor(province),
fill = factor(province),
color = factor(province)),
alpha=.4) +
xlab(NULL) + ylab(NULL) + labs(fill='Province', color='Province')
data(channels)
channels %>% glimpse
## Rows: 4,811
## Columns: 5
## $ PID <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ POS <int> 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 5…
## $ X <dbl> -129.3084, -129.1936, -129.1912, -129.1887, -129.1886, -129.1…
## $ Y <dbl> 52.96084, 52.92536, 52.92495, 52.92681, 52.93126, 52.93364, 5…
## $ province <chr> "caa", "caa", "caa", "caa", "caa", "caa", "caa", "caa", "caa"…
Check it out:
gg_kfs() +
geom_polygon(data=channels,
mapping=aes(x=X,
y=Y,
fill = province,
color = province),
alpha=.4) +
xlab(NULL) + ylab(NULL)
data(kfs_blocks_bbox)
kfs_blocks_bbox %>% glimpse
## Rows: 26
## Columns: 5
## $ id <chr> "CAAS", "CAAC", "CAAN", "ESTS", "ESTC", "ESTN", "CMPS", "CMPC",…
## $ left <dbl> -129.3070, -129.5499, -129.5500, -129.6373, -129.7473, -129.761…
## $ right <dbl> -129.0806, -129.3070, -129.3127, -129.4416, -129.5395, -129.540…
## $ bottom <dbl> 52.76424, 52.77458, 52.95501, 53.04410, 53.11001, 53.17002, 52.…
## $ top <dbl> 52.96012, 52.95498, 53.04409, 53.10999, 53.16999, 53.22782, 52.…
Check it out:
gg_kfs() +
geom_rect(data=kfs_blocks_bbox,
mapping=aes(xmin=left,
xmax=right,
ymin=bottom,
ymax=top,
group=id),
fill=NA,
color='black') +
xlab(NULL) + ylab(NULL)
data(blocks)
blocks %>% glimpse
## Rows: 54
## Columns: 11
## $ ID <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18…
## $ name <chr> "Parker Pass", "Central Loredo", "North Loredo", "West Rennis…
## $ sq.km <dbl> 78.59, 44.82, 45.17, 38.56, 15.63, 97.74, 12.16, 51.55, 28.96…
## $ left <dbl> -129.4811, -129.2744, -129.2737, -129.4814, -129.3482, -129.6…
## $ right <dbl> -129.2744, -129.0512, -129.1175, -129.3482, -129.2741, -129.4…
## $ top <dbl> 52.81542, 52.81563, 52.85918, 52.85939, 52.85939, 52.91180, 5…
## $ bottom <dbl> 52.75271, 52.75271, 52.81563, 52.81563, 52.81542, 52.81563, 5…
## $ width <dbl> 0.20668030, 0.22315979, 0.15621185, 0.13320923, 0.07415771, 0…
## $ height <dbl> 0.06270989, 0.06291739, 0.04355275, 0.04376004, 0.04396754, 0…
## $ center.x <dbl> -129.3777, -129.1628, -129.1956, -129.4148, -129.3111, -129.5…
## $ center.y <dbl> 52.78407, 52.78417, 52.83740, 52.83751, 52.83740, 52.86371, 5…
Check it out:
gg_kfs() +
geom_rect(data=blocks,
mapping=aes(xmin=left,
xmax=right,
ymin=bottom,
ymax=top,
group=name),
fill=NA,
color='black') +
xlab(NULL) + ylab(NULL)
data(stations)
stations %>% glimpse
## Rows: 106
## Columns: 11
## $ X <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 1…
## $ block <chr> "CAA", "CAA", "CAA", "CAA", "CAA", "CAA", "CAA", "CAA", "CAA…
## $ sta <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 1, 2, 3, 4, 5, 6,…
## $ mode <chr> "stout", "stout", "lite", "stout", "stout", "stout", "lite",…
## $ lat <dbl> 52.89065, 52.86761, 52.84456, 52.85030, 52.85604, 52.88457, …
## $ long <dbl> -129.1702, -129.2017, -129.2332, -129.2840, -129.3348, -129.…
## $ color <chr> "black", "black", "blue", "black", "black", "black", "blue",…
## $ dist2next <dbl> 6.413930, 6.413930, 12.404973, 12.404973, 5.399255, 5.399255…
## $ tlen <dbl> 3.206965, 3.206965, 6.202486, 6.202486, 2.699627, 2.699627, …
## $ dist.nm <dbl> 2.063725, 2.064170, 2.155478, 2.155202, 2.063981, 2.063863, …
## $ dist.km <dbl> 3.822019, 3.822842, 3.991944, 3.991435, 3.822493, 3.822274, …
Check it out:
gg_kfs() +
geom_point(data=stations,
mapping=aes(x=long,
y=lat,
group = block),
alpha=.8) +
xlab(NULL) + ylab(NULL) + labs(group='Waterway')
All survey effort aboard the Bangarang:
data(effort)
effort %>% glimpse
## Rows: 25,526
## Columns: 27
## $ date <dttm> 2013-07-06 07:21:38, 2013-07-06 07:22:01, 2013-07-06 07:…
## $ lat <dbl> 53.38721, 53.38750, 53.38818, 53.38875, 53.38947, 53.3902…
## $ lon <dbl> -129.1697, -129.1698, -129.1700, -129.1703, -129.1703, -1…
## $ knots <dbl> 2.7, 2.5, 2.4, 2.2, 2.5, 4.3, 2.8, 3.2, 1.7, 1.3, 0.7, 0.…
## $ hdg <dbl> NA, 355.1, 325.5, NA, NA, NA, NA, 347.7, 342.2, 335.8, 34…
## $ circuit <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ effort <chr> "casual", "casual", "casual", "casual", "casual", "casual…
## $ echo <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ obs1 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ pos1 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ obs2 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ pos2 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ obs3 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ pos3 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ bft <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ perc_cloud <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ vis <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ precip <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ glare_L <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ glare_R <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ wind_hdg_raw <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ wind_kph_raw <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ air_temp <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ ss_temp <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ ss_sal <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ seconds <dbl> 23, 59, 61, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 6…
## $ km <dbl> 0.03287696, 0.07681844, 0.06633145, 0.08024583, 0.0842948…
Map overview:
gg_kfs() +
geom_point(data=effort,
mapping=aes(x=lon,
y=lat,
color = effort),
alpha=.4,
size=.2) +
xlab(NULL) + ylab(NULL)
Show each year separately:
gg_kfs() +
geom_point(data=effort,
mapping=aes(x=lon,
y=lat,
color = effort),
alpha=.4,
size=.2) +
facet_wrap(~lubridate::year(date)) +
xlab(NULL) + ylab(NULL) +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
Show each circuit in 2015 separately, systematic transect effort only:
# Filter to transect effort only
transects <-
effort %>%
filter(lubridate::year(date) == 2015,
effort == 'transect') %>%
mutate(group = paste0(lubridate::year(date), ' circuit ', circuit))
# plot it
gg_kfs() +
geom_point(data= transects,
mapping=aes(x=lon,
y=lat),
alpha=.4,
size=.2) +
facet_wrap(~group) +
xlab(NULL) + ylab(NULL) +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
data(seabirds)
seabirds %>% glimpse
## Rows: 3,668
## Columns: 30
## $ datetime <dttm> 2013-07-06 09:16:18, 2013-07-06 09:19:43, 2013-07-06 09:…
## $ lat <dbl> 53.38375, 53.38587, 53.38589, 53.40532, 53.40536, 53.4053…
## $ lon <dbl> -129.1378, -129.1364, -129.1364, -129.0073, -129.0074, -1…
## $ effort <chr> "transect", "transect", "transect", "transect", "transect…
## $ bft <dbl> 0, 0, 0, 0, 0, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
## $ precip <chr> "clear", "clear", "clear", "clear", "clear", "clear", "cl…
## $ knots <chr> NA, NA, NA, NA, NA, "4", "4", "4", "4", NA, NA, NA, NA, N…
## $ hdg <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ wind_kph_raw <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ wind_hdg_raw <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ zone <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ line <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ side <chr> "HELM", NA, "PORT", "STAR", NA, "HELM", NA, "PORT", "PORT…
## $ best <dbl> 2, 1, 1, 2, 2, 6, 1, 1, 2, 2, 2, 2, 4, 1, 2, 1, 2, 1, 1, …
## $ min <dbl> 2, 1, 1, 2, 2, 6, 1, 1, 2, 2, 2, 2, 4, 1, 2, 1, 2, 1, 1, …
## $ max <dbl> 2, 1, 1, 2, 2, 6, 1, 1, 2, 2, 2, 2, 4, 1, 2, 1, 2, 1, 1, …
## $ feed <chr> "N", "Y", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N…
## $ motion <chr> "FLY", "SIT", "FLY", "FLY", "SIT", "FLY", "RAFT", "FLY", …
## $ dir <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ height <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ sp1 <chr> "PIGU", "MAMU", "WHCG", "MAMU", "MAMU", "MAMU", "WEGU", "…
## $ per1 <dbl> 100, 100, 100, 100, 100, 100, 100, 0, 0, 0, 0, 0, 0, 0, 0…
## $ plum1 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ sp2 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ per2 <dbl> 0, 0, 0, 0, 0, 0, 0, 100, 100, 100, 100, 100, 100, 100, 1…
## $ plum2 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ sp3 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ per3 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ plum3 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
## $ date <date> 2013-07-06, 2013-07-06, 2013-07-06, 2013-07-06, 2013-07-…
Map it:
gg_kfs() +
geom_point(data=seabirds,
mapping=aes(x=lon,
y=lat,
color = sp1,
size = best),
alpha=.4) +
xlab(NULL) + ylab(NULL)
Column definitions:
date
: Date and time, in format
YYYY-MM-DD HH:MM:SS
.
lat
: Latitude in decimal
degrees.
lon
: Longitude in decimal
degrees.
effort
: Code for search effort:
"off"
= opportunistic sighting occurring while off search
effort; "casual"
= opportunistic sighting occurring while
transiting area while off seaerch effort; "station"
=
opportunistic sighting while conducting sampling at oceanographic
stations; "with whale"
= opportunistic sighting while
conducting focal follows of whales; "transect"
= seabird
detection during systematic line-transect effort. Analyses focused on
density/abundance estimation should filter to "transect"
effort only.
bft
: Beaufort sea state, ranging
from 0
(glassy calm) to 4
(white caps
everywhere); this may be a useful covariate in a density/abundance
model.
precip
: Precipitation state;
another potentially useful covariate.
knots
: Ship speed in
knots.
hdg
: Ship heading (range = 0 - 360
degrees).
wind_kph_raw
: Apparent wind speed,
in kilometers per hour, as received by the shipboard weather station,
unadjusted for ship speed or heading.
wind_hdg_raw
: Apparent wind
heading, in degrees, as received by the shipboard weather station,
unadjusted for ship speed or heading.
zone
: Category of distance from the
vessel’s track line, as estimated by a handheld rangefinder: Zone
"1"
is 0 - 75m, Zone "2"
is 75m - 150m. The
zone "OUT"
means the bird was estimated to be beyond 150m.
The zone "221"
means that the bird was originally seen in
Zone 2
but it approached as near as Zone 1 (either due to
bird movement or ship approaching the bird). We recorded this note to
test for a couple things: (1) If more birds are being observed in Zone 1
vs. Zone 2, that may mean that we are missing some birds in Zone 2 and
our assumption that our strip width is 150m on either side of the vessel
may be problematic; and (2) If more birds are being observed in Zone 2
than Zone 1, that may mean that birds are avoiding the vessel and some
could even be flushing beyond Zone 2, which could also be problematic
for density estimation.
line
: An indication of whether
there was uncertainty about which zone
the bird occurred
within; if MIDL
, the bird occurred right on the line
between Zone 1 and Zone 2 by the observer’s estimation; if
OUTR
, the bird occurred right on the line between Zone 2
and beyond. If your analysis assumed a strip width of 150m on each side
of the vessel, the sightings marked OUTR
may be of special
interest: how drastically would your density estimates change if those
sightings were included or excluded?
side
: The side of the vessel
(PORT
or STARBOARD
) the bird was seen on. If
the value is HELM
, that means the researcher at the data
entry position at the helm detected the sighting. Likely not relevant to
basic density/abundance estimation analyses.
best
: Best estimate of total flock
size.
min
: Minimum estimate of total
flock size.
max
: Maximum estimate of total
flock size.
feed
: Indication of whether or not
the birds are feeding: "M"
= Maybe; "N"
= No;
"S"
= Some; "W"
= Unknown; "Y"
=
Yes.
motion
: Indication of bird’s motion
behavior: "FLUSH"
= Birds were originally sitting but were
flushed (either dove or flew off) in response to the research vessel;
"FLY"
= Birds were in flight; "FOLO"
= Birds
are following the research vessel (problematic for recounting!);
"RAFT"
= Birds were rafting on surface debris, such as
floating kelp or logs; "SIT"
= Birds were sitting on the
water.
dir
: Indication of bird’s flight
direction (if flying); cardinal directions are typically used
("N"
, "NE"
, "E"
,
"SE"
, "S"
, "SW"
,
"W"
, "NW"
) with the exception of birds
circling the vessel ("CIR"
or "CR"
).
height
: Estimate of bird’s flight
height above sea level, in meters. This is relevant because wind speed
increases predictably with height above sea level.
sp1
: Four-letter species code for
primary (and possibly only) species in the flock.
per1
: Percentage of total flock
size that sp1
comprises.
plum1
: Primary plumage state for
sp1
. Note that, in some cases, if a single species occurs
in several different plumage state, the same species might be entered as
sp1
, sp2
, and even sp3
. Codes:
"AB"
or "ABE"
= Adult breeding, either/both
sex(es); "ABF"
= Adult breeding, female; "ABM"
= Adult breeding, male; "ANE"
= Adult non-breeding,
either/both sex(es); "FLE"
= Fledgling; "JV"
or "JVE"
= Juvenile; "MX"
= Mixture of
unspecified plumages present; "NBE"
= Non-breeding,
either/both sex(es); "OTH"
= Other unspecified plumage;
"SA"
= Sub-adult plumage; "AAE"
or
"AAM"
= unknown.
sp2
: If this is a mixed-species
flock with two or more species present, this is the four-letter species
code for the second species.
per2
: Percentage of total flock
size that sp2
comprises, if present.
plum2
: Primary plumage state for
sp2
, if present.
sp3
: If this is a mixed-species
flock with three species present, this is the four-letter species code
for the third species.
per3
: Percentage of total flock
size that sp3
comprises, if present.
plum3
: Primary plumage state for
sp3
, if present.
data(salmon)
salmon %>% glimpse
## Rows: 1,033
## Columns: 14
## $ date <dttm> 2013-07-06 07:56:28, 2013-07-06 08:25:17, 2013-07-06 09:12:13…
## $ lat <dbl> 53.38662, 53.37430, 53.37436, 53.37551, 53.38709, 53.38761, 53…
## $ lon <dbl> -129.1690, -129.1522, -129.1522, -129.1383, -129.1353, -129.13…
## $ effort <chr> "off", "casual", "transect", "transect", "transect", "transect…
## $ echo <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ circuit <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ bft <dbl> NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ precip <chr> NA, "clear", "clear", "clear", "clear", "clear", "clear", "cle…
## $ knots <dbl> NA, NA, 4.0, 4.0, 5.8, 5.8, 5.8, 5.8, 5.8, 5.8, 5.8, 4.0, NA, …
## $ hdg <dbl> NA, NA, 357, 357, NA, NA, NA, NA, NA, NA, NA, 49, NA, NA, NA, …
## $ jumps <dbl> 1, 5, 1, 1, 2, 1, 2, 1, 0, 1, 1, 2, 2, NA, 1, 1, 1, 1, 1, 3, N…
## $ zone <dbl> NA, NA, NA, NA, NA, 12, NA, 12, 12, 12, 12, 12, 12, 12, NA, 12…
## $ line <chr> "MIDL", NA, NA, NA, NA, NA, NA, NA, NA, "MIDL", "MIDL", NA, NA…
## $ side <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
Map it;
gg_kfs() +
geom_point(data=salmon,
mapping=aes(x=lon,
y=lat,
size = jumps),
alpha=.4) +
xlab(NULL) + ylab(NULL)
ggplot(salmon %>%
filter(zone < 3) %>%
mutate(Zone = factor(zone)),
aes(x=Zone)) +
geom_bar(stat='count')
data(whale_sightings)
whale_sightings %>% glimpse
## Rows: 549
## Columns: 15
## $ day <dbl> 20130706, 20130706, 20130713, 20130713, 20130713, 20130713, 2…
## $ datetime <dttm> 2013-07-06 09:51:38, 2013-07-06 11:06:33, 2013-07-13 13:03:4…
## $ x <dbl> -129.1201, -129.1180, -129.2433, -129.2839, -129.3088, -129.2…
## $ y <dbl> 53.42348, 53.42000, 53.33910, 53.35212, 53.31438, 53.28156, 5…
## $ distance <dbl> NA, NA, NA, NA, 0.37324576, 0.43999272, NA, NA, NA, NA, NA, N…
## $ size <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2…
## $ spp <chr> "HW", "HW", "HW", "HW", "HW", "BW", "BW", "BW", "BW", "HW", "…
## $ block <chr> "VER", "VER", "WRI", "WRI", "WRI", "WRI", "WRI", "WRI", "WRI"…
## $ bft <dbl> 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1…
## $ seg_id <dbl> 2, 3, 26, 26, 28, 29, 29, 29, 29, 29, 29, 30, 30, 33, 34, 34,…
## $ z <dbl> 273.0400, 315.2533, 508.5700, 511.5049, 534.4100, 507.9533, 5…
## $ zmin <dbl> 0.0088041, 0.0199857, 433.0874939, 119.7148972, 375.9632874, …
## $ zmax <dbl> 324.5167, 342.9700, 514.0425, 522.6160, 538.5000, 542.8300, 5…
## $ zsd <dbl> 86.327932, 59.499729, 28.563368, 90.137482, 24.123574, 164.44…
## $ zrange <dbl> 324.50789, 342.95002, 80.95499, 402.90113, 162.53671, 542.811…
gg_kfs() +
geom_point(data=whale_sightings,
mapping=aes(x=x,
y=y,
color = spp,
size = size),
alpha=.4) +
xlab(NULL) + ylab(NULL)
This dataset is an interpolated grid of oceanographic values for each circuit in each year. Some variables are from the thermosalinograph we had running at the surface during transects, some are from the echosounder we used while underway, and others are from the Seabird Electronics CTD we used at the grid of oceanographic stations.
data(ocean)
ocean %>% glimpse
## Rows: 1,488,101
## Columns: 7
## $ year <chr> "2013", "2013", "2013", "2013", "2013", "2013", "2013", "2013"…
## $ circuit <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1…
## $ metric <chr> "MLD", "MLD", "MLD", "MLD", "MLD", "MLD", "MLD", "MLD", "MLD",…
## $ lat <dbl> 53.54780, 53.54780, 53.54780, 53.54780, 53.54780, 53.54780, 53…
## $ lon <dbl> -129.0307, -129.0262, -129.0218, -129.0173, -129.0128, -129.00…
## $ value <dbl> 1.4525017, 1.1519441, 1.1362478, 1.1260875, 0.8683625, 1.11903…
## $ color <chr> "#A80000", "#930000", "#930000", "#930000", "#930000", "#93000…
Here are the variables included:
ocean$metric %>% unique %>% sort
## [1] "BSM" "Chl.max" "Chl.max.z" "Chl.sum" "Int033" "Int200"
## [7] "MLD" "Sbot" "Sdeep" "Secchi" "SSS" "SST"
## [13] "Stop" "Str.all" "Str.bot" "Str.top" "Tbot" "TC.str"
## [19] "TC.z" "Tdeep" "Ttop"
As an example, say we wanted a map of the sea surface salinity
(SSS
) for each circuit of the 2015 season.
# Filter the dataset
sss <-
ocean %>%
filter(metric == 'SSS',
year==2015)
# Map it
gg_kfs() +
geom_point(data=sss,
mapping=aes(x=lon,
y=lat,
color=value),
size=.05) +
facet_wrap(~circuit) +
xlab(NULL) + ylab(NULL) +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank()) +
scale_color_gradientn(colours = rev(rainbow(5)))
As another example, here is integrated 200kHz backscatter (which is a useful proxy for euphausiids) in the two circuits completed in 2014:
# Filter the dataset
krill <-
ocean %>%
filter(metric == 'Int200')
# Map it
gg_kfs() +
geom_point(data=krill,
mapping=aes(x=lon,
y=lat,
color=value),
size=.05) +
facet_wrap(~circuit) +
xlab(NULL) + ylab(NULL) +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank()) +
scale_color_gradientn(colours = rev(rainbow(5)),
trans='log', breaks=c(2.5, 25, 250, 2500))
in_block()
A way to see which geostratum a pair of coordinates is located within.
in_block(x= -129.4, y=53.2)
## [1] "-SQUN"
in_block(x= -129.2, y=52.9)
## [1] "-CAAS"
in_kfs()
A quick test to see if coordinates properly within the water within
the boundaries of the KFS (as defined by the Bangarang
project). The function returns the dataset with a new column,
inkfs
:
test <- in_kfs(seabirds %>% rename(x=lon, y=lat),
toplot = TRUE)
test$inkfs %>% table
## .
## FALSE TRUE
## 6 3616
in_water()
A quick test to see if coordinates properly within the water within
the water (and not on land by mistake). The function returns the dataset
with a new column, valid
:
test <- in_water(seabirds %>% rename(x=lon, y=lat),
toplot = TRUE)
test$valid %>% table
## .
## TRUE
## 3589
whale_map()
Calculates the true position of a sighting within the KFS, in offshore or confined coastal waters, accounting for whether or not the observer is using the horizon or a shoreline as the basis for the reticle reading.
The X
and Y
you supply is the observer’s
location (either a boat or a stationary field station).
whalemap(X= -129.2,
Y=52.9,
bearing = 313,
reticle = 0.2,
eye.height = 2.1,
vessel.hdg = 172,
toplot=TRUE)
## $X
## [1] -129.2168
##
## $Y
## [1] 52.90468
##
## $radial.dist
## [1] 1.243924
##
## $boundary
## [1] "horizon"
##
## $boundary.dist
## [1] 5.172833
##
## $perp.dist
## [1] 1.046189
##
## $track.bearing
## [1] 122.75
To showcase some other functions in the bangarang
package, here is an example workflow for estimating the
density/abundance of a seabird from the Bangarang study area. I
use a density-surface-modeling (DSM) approach based on the
R
package dsm
(Github). In a DSM,
species density is estimated for each point in a grid of your study
area. The bangarang
package provides a 1-km2
grid as a built-in dataset:
data(grid, package='bangarang')
grid %>% head
## # A tibble: 6 × 14
## y1 y2 y x1 x2 x km2 id z zmin zmax zsd
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 52.8 52.8 52.8 -129. -129. -129. 1.02 466 164. 0.000000500 210. 59.9
## 2 52.8 52.8 52.8 -129. -129. -129. 1.02 467 223. 12.5 238. 44.0
## 3 52.8 52.8 52.8 -129. -129. -129. 1.02 475 45.0 0 79.5 21.5
## 4 52.8 52.8 52.8 -129. -129. -129. 1.02 476 68.7 45.8 84.8 6.76
## 5 52.8 52.8 52.8 -129. -129. -129. 1.02 477 77.2 69.2 95.6 6.13
## 6 52.8 52.8 52.8 -129. -129. -129. 1.02 478 97.6 81.8 126. 11.2
## # ℹ 2 more variables: zrange <dbl>, block <chr>
Here is a map of the centroid of each grid cell:
gg_kfs() +
geom_point(data=grid, mapping=aes(x=x, y=y), size=.2)
The package also offers a dataset of systematic transect effort split into 5-km segments, a common length in habitat modeling:
data(segments_5km)
# Rename
segments <- segments_5km
# Check out
segments %>% head
## # A tibble: 6 × 18
## Sample.Label Effort x y datetime year day block z
## <dbl> <dbl> <dbl> <dbl> <dttm> <dbl> <dbl> <chr> <dbl>
## 1 1 5.00 -129. 53.4 2013-07-06 09:09:01 2013 20130706 VER 330.
## 2 2 5.01 -129. 53.4 2013-07-06 09:55:43 2013 20130706 VER 316.
## 3 3 5.00 -129. 53.4 2013-07-06 11:19:48 2013 20130706 VER 220.
## 4 4 5.00 -129. 53.5 2013-07-06 12:14:16 2013 20130706 VER 168.
## 5 5 5.01 -129. 53.5 2013-07-06 14:11:25 2013 20130706 VER 102.
## 6 6 5.00 -129. 53.5 2013-07-06 14:42:32 2013 20130706 VER 131.
## # ℹ 9 more variables: zmin <dbl>, zmax <dbl>, zsd <dbl>, zrange <dbl>,
## # yday <dbl>, month <dbl>, circuit <dbl>, circuit_name <fct>,
## # circuit_yday <dbl>
# View centroid of each segment
gg_kfs() +
geom_point(data=segments, mapping=aes(x=x, y=y), alpha=.5)
We then filter the built-in sightings dataset,
"seabirds"
, to our species of interest. In this case,
marbled murrelet (4-letter code: “MAMU”). We will analyze
sitting/flushed birds separately from flying birds, since the former
will be relevant to subsequent foraging preference analyses.
data(seabirds)
mamu <-
seabirds %>%
filter(effort == 'transect') %>%
filter(zone != 'OUT') %>%
# filter to any sighting that had MAMU in any of the three species slots
filter(sp1 == 'MAMU' | sp2 == 'MAMU' | sp3 == 'MAMU') %>%
filter(motion %in% c('SIT', 'FLUSH')) %>%
# provide a single authoritative species ID
mutate(spp = 'MAMU')
# Check sample size
mamu %>% nrow
## [1] 235
# Review data
mamu %>% as.data.frame %>% head
## datetime lat lon effort bft precip knots hdg
## 1 2014-08-11 12:42:14 53.42632 -129.0965 transect 3 haze 5.1 270.3
## 2 2014-08-11 13:57:59 53.40101 -129.1281 transect 1 haze 6.4 189.3
## 3 2014-08-12 11:24:55 53.28757 -129.4219 transect 0 clear 5.1 108.6
## 4 2014-08-14 08:37:25 53.09309 -129.1341 transect 0 clear 5 69.9
## 5 2014-08-20 08:16:01 53.18610 -129.5766 transect 1 clear 5.9 265.2
## 6 2014-08-20 11:32:14 53.10107 -129.5686 transect 1 clear 5.3 193.4
## wind_kph_raw wind_hdg_raw zone line side best min max feed motion dir height
## 1 29.9316 163.3 1 <NA> PORT 2 2 2 M SIT <NA> NA
## 2 5.4000 351.2 1 <NA> STAR 3 3 3 M SIT <NA> NA
## 3 6.6000 10.3 2 <NA> STAR 1 1 1 M FLUSH <NA> NA
## 4 7.2000 2.3 1 <NA> PORT 1 1 1 N SIT <NA> NA
## 5 10.2000 9.3 2 <NA> STAR 1 1 1 N SIT <NA> NA
## 6 9.7000 28.7 1 <NA> STAR 1 1 1 M SIT <NA> NA
## sp1 per1 plum1 sp2 per2 plum2 sp3 per3 plum3 date spp
## 1 MAMU 100 <NA> <NA> 0 <NA> <NA> 0 <NA> 2014-08-11 MAMU
## 2 MAMU 100 AB <NA> 0 <NA> <NA> 0 <NA> 2014-08-11 MAMU
## 3 MAMU 100 SA <NA> 0 <NA> <NA> 0 <NA> 2014-08-12 MAMU
## 4 MAMU 100 SA <NA> 0 <NA> <NA> 0 <NA> 2014-08-14 MAMU
## 5 MAMU 100 AB <NA> 0 <NA> <NA> 0 <NA> 2014-08-20 MAMU
## 6 MAMU 100 <NA> <NA> 0 <NA> <NA> 0 <NA> 2014-08-20 MAMU
# Map data
gg_kfs() +
geom_point(data=mamu, mapping=aes(x=lon, y=lat, size=best), alpha=.5)
We now need to determine which effort segment each sighting occurred
during. To do so, use the function segment_sync()
, which
uses time differences to find the segment that overlaps with each
sighting’s timestamp:
# Sync segment numbers to sightings
mamu <- segment_sync(segments = segments,
sightings = mamu)
We then use the format_observation_data()
function to
transform the mamu
dataset to meet formatting requirements
for the DSM:
observation_data <- format_observation_data(mamu)
observation_data %>% as.data.frame %>% head
## datetime lat lon effort bft precip knots hdg
## 1 2014-08-11 12:42:14 53.42632 -129.0965 transect 3 haze 5.1 270.3
## 2 2014-08-11 13:57:59 53.40101 -129.1281 transect 1 haze 6.4 189.3
## 3 2014-08-12 11:24:55 53.28757 -129.4219 transect 0 clear 5.1 108.6
## 4 2014-08-14 08:37:25 53.09309 -129.1341 transect 0 clear 5 69.9
## 5 2014-08-20 08:16:01 53.18610 -129.5766 transect 1 clear 5.9 265.2
## 6 2014-08-20 11:32:14 53.10107 -129.5686 transect 1 clear 5.3 193.4
## wind_kph_raw wind_hdg_raw zone line side best min max feed motion dir height
## 1 29.9316 163.3 1 <NA> PORT 2 2 2 M SIT <NA> NA
## 2 5.4000 351.2 1 <NA> STAR 3 3 3 M SIT <NA> NA
## 3 6.6000 10.3 2 <NA> STAR 1 1 1 M FLUSH <NA> NA
## 4 7.2000 2.3 1 <NA> PORT 1 1 1 N SIT <NA> NA
## 5 10.2000 9.3 2 <NA> STAR 1 1 1 N SIT <NA> NA
## 6 9.7000 28.7 1 <NA> STAR 1 1 1 M SIT <NA> NA
## sp1 per1 plum1 sp2 per2 plum2 sp3 per3 plum3 date spp
## 1 MAMU 100 <NA> <NA> 0 <NA> <NA> 0 <NA> 2014-08-11 12:42:14 MAMU
## 2 MAMU 100 AB <NA> 0 <NA> <NA> 0 <NA> 2014-08-11 13:57:59 MAMU
## 3 MAMU 100 SA <NA> 0 <NA> <NA> 0 <NA> 2014-08-12 11:24:55 MAMU
## 4 MAMU 100 SA <NA> 0 <NA> <NA> 0 <NA> 2014-08-14 08:37:25 MAMU
## 5 MAMU 100 AB <NA> 0 <NA> <NA> 0 <NA> 2014-08-20 08:16:01 MAMU
## 6 MAMU 100 <NA> <NA> 0 <NA> <NA> 0 <NA> 2014-08-20 11:32:14 MAMU
## Sample.Label object size distance
## 1 169 1 2 75
## 2 170 2 3 75
## 3 174 3 1 150
## 4 186 4 1 75
## 5 211 5 1 150
## 6 215 6 1 75
Since we used strip-width sampling in seabird surveys aboard the Bangarang, we will create a dummy detection function that stands in for this aspect of the DSM analysis:
dummy_ds <- create_dummy_detection_function(observation_data)
We can now develop our DSM using the function dsm()
from
package dsm
. Here is a basic example that models density
according to a spline interaction of longitude and latitude. (The
convert.units
input handles the fact that, in our
seabirds
dataset, distance is specified in meters instead
of kilometers.).
library(dsm)
dsm_basic <-
dsm(formula = density.est ~ s(x,y),
ddf.obj = dummy_ds,
convert.units=1/1000,
segment.data = segments,
observation.data = observation_data)
Here is a summary of that model:
dsm_basic %>% summary
##
## Family: quasipoisson
## Link function: log
##
## Formula:
## density.est ~ s(x, y)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.0613 0.2508 -8.22 1.01e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(x,y) 22.97 26.46 14.3 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.168 Deviance explained = 51.5%
## -REML = 726.52 Scale est. = 5.2335 n = 713
We then use that DSM to predict murrelet density on our spatial grid:
d <- predict_dsm_kfs(dsm_basic, segments, grid)
That function, predict_dsm_kfs()
, returns a vector of
density estimates (birds km-2) equal in length to the number
of rows in the grid
data.frame
.
gg_kfs() +
geom_point(data=grid %>% mutate(ds=d),
mapping=aes(x=x, y=y, color=ds), pch=15, size=2) +
scale_color_viridis_c(trans='log', breaks=c(0.02, .2, 2, 20)) +
labs(color = 'Density')
To improve the DSM model, you can introduce covariates from your
segments
dataset. The bangarang
package
provides a function, dsm_comps()
, that will use
AICc to rank candidate models for all possible combinations of
the supplied covariates. The analyst can then decide which formula to
bring into the density estimation stage. The assumed distribution family
is "tweedie"
.
dsm_comps <-
dsm_compare(segments,
observation_data,
dummy_ds,
candidate_splines = c('s(yday)', 's(z)', 'year'),
base_formula = 'density.est ~ s(x,y)')
That process produces the following summary:
dsm_comps$dsm
## n formula model_id AICc dev.expl
## 1 3 density.est ~ s(x,y) + s(yday) + s(z) + year 7 1136.953 0.6915890
## 2 2 density.est ~ s(x,y) + s(yday) + year 5 1151.730 0.6579551
## 3 2 density.est ~ s(x,y) + s(yday) + s(z) 4 1152.788 0.6723841
## 4 1 density.est ~ s(x,y) + s(yday) 1 1165.685 0.6379926
## 5 2 density.est ~ s(x,y) + s(z) + year 6 1193.288 0.5885648
## 6 1 density.est ~ s(x,y) + year 3 1203.471 0.5496048
## 7 1 density.est ~ s(x,y) + s(z) 2 1252.977 0.4971941
Based on this summary, we can get details on the most parsimonious model:
dsm_comps$objects[[7]] %>% summary
##
## Family: Tweedie(p=1.374)
## Link function: log
##
## Formula:
## density.est ~ s(x, y) + s(yday) + s(z) + year
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2269.8085 504.1799 -4.502 7.93e-06 ***
## year 1.1254 0.2503 4.497 8.11e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(x,y) 18.881 23.395 10.226 < 2e-16 ***
## s(yday) 7.979 8.676 9.616 < 2e-16 ***
## s(z) 6.126 7.264 3.144 0.00263 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.63 Deviance explained = 69.2%
## -REML = 586.03 Scale est. = 4.4529 n = 712
And we will save that model object as our model to keep:
dsm_keep <- dsm_comps$objects[[7]]
Our best-fit model involves date components. So, to produce the point
estimate, we will use a wrapper function,
predict_circuits()
, to predict the density surface during
each circuit of the study area. We completed a total of 9 circuits whose
information is contained in the built-in dataset
data(circuits)
:
data(circuits)
circuits %>% head
## # A tibble: 6 × 7
## circuit year begin end name
## <dbl> <dbl> <dttm> <dttm> <chr>
## 1 1 2013 2013-07-06 00:01:00 2013-07-27 23:59:00 Jul '13
## 2 2 2013 2013-08-10 00:01:00 2013-09-04 23:59:00 Aug '13
## 3 3 2014 2014-08-11 00:01:00 2014-08-28 23:59:00 Aug '14
## 4 4 2014 2014-08-29 00:01:00 2014-09-14 23:59:00 Sep ' 14
## 5 5 2015 2015-05-24 00:01:00 2015-06-14 23:59:00 May-Jun '15
## 6 6 2015 2015-06-15 00:01:00 2015-07-10 23:59:00 Jun-Jul '15
## # ℹ 2 more variables: middate <dttm>, middate_yday <dbl>
We now rerun the model:
d_circuits <-
predict_circuits(dsm_keep,
segments,
grid,
toplot=FALSE)
We can use the function point_summary()
to calculate
average density across the DSM:
point_summary(d_circuits)
## # A tibble: 9 × 6
## year circuit circuit_name circuit_yday dmean dsd
## <dbl> <dbl> <fct> <dbl> <dbl> <dbl>
## 1 2013 1 Jul '13 198 0.369 2.23
## 2 2013 2 Aug '13 235 0.0363 0.219
## 3 2014 3 Aug '14 232 0.300 1.81
## 4 2014 4 Sep ' 14 249 0.174 1.05
## 5 2015 5 May-Jun '15 155 1.26 7.64
## 6 2015 6 Jun-Jul '15 179 3.95 23.9
## 7 2015 7 Late Jul '15 202 7.16 43.3
## 8 2015 8 Aug '15 226 0.934 5.65
## 9 2015 9 Sep '15 252 0.498 3.01
Here is a plot of it:
gg_kfs() +
geom_point(data=d_circuits %>% mutate(ds=d),
mapping=aes(x=x, y=y, color=ds), pch=15) +
scale_color_viridis_c(trans='log', breaks=c(0.01, .1, 1, 10, 100)) +
facet_wrap(~circuit_name) +
xlab(NULL) + ylab(NULL) + labs(color='Density')
To convert this DSM density estimate to a total abundance estimate, we take the average density of the grid and multiply it by the study area:
point_summary(d_circuits) %>%
mutate(N = round(dmean * sum(grid$km2)))
## # A tibble: 9 × 7
## year circuit circuit_name circuit_yday dmean dsd N
## <dbl> <dbl> <fct> <dbl> <dbl> <dbl> <dbl>
## 1 2013 1 Jul '13 198 0.369 2.23 525
## 2 2013 2 Aug '13 235 0.0363 0.219 52
## 3 2014 3 Aug '14 232 0.300 1.81 427
## 4 2014 4 Sep ' 14 249 0.174 1.05 247
## 5 2015 5 May-Jun '15 155 1.26 7.64 1796
## 6 2015 6 Jun-Jul '15 179 3.95 23.9 5619
## 7 2015 7 Late Jul '15 202 7.16 43.3 10186
## 8 2015 8 Aug '15 226 0.934 5.65 1328
## 9 2015 9 Sep '15 252 0.498 3.01 708
To quantify the uncertainty in our point estimate, the
bangarang
package provides a bootstrap function.
First, we store the formula used in the best-fit DSM:
(dsm_formula <- summary(dsm_keep)$formula)
## density.est ~ s(x, y) + s(yday) + s(z) + year
Next, we can use the function bootstrapper_wrapper()
to
repeat the circuit density prediction routine with iteratively resampled
versions of the segments and sightings data. (The norm is typically
1,000 or more bootstrap iterations or more – prepare for a multi-hour
run! I recommend trying this out with just 50 iterations at first to
confirm that the results make sense before investing in the full-fledged
run – bird pun intended!).
bootstraps <-
bootstrapper_wrapper(dsm_formula,
segments,
observation_data,
grid,
group_draw = TRUE,
on_the_line = FALSE,
B = 1000,
output_filename = NULL)
load('demo_boots.rds')
(See the documentation for that function for more help.)
These results be summarized using the function
bootstrapper_summary()
:
bootstrapper_summary(bootstraps)
## # A tibble: 9 × 7
## year circuit circuit_name circuit_yday SD L95 U95
## <dbl> <dbl> <fct> <dbl> <dbl> <dbl> <dbl>
## 1 2013 1 Jul '13 198 1.31 0.0657 4.00
## 2 2013 2 Aug '13 235 0.191 0.00353 0.434
## 3 2014 3 Aug '14 232 1.12 0.0557 2.64
## 4 2014 4 Sep ' 14 249 0.668 0.0299 1.83
## 5 2015 5 May-Jun '15 155 6.35 0.303 13.4
## 6 2015 6 Jun-Jul '15 179 14.4 1.11 38.4
## 7 2015 7 Late Jul '15 202 16.6 1.41 48.5
## 8 2015 8 Aug '15 226 3.80 0.240 8.07
## 9 2015 9 Sep '15 252 2.17 0.0909 5.45
That bootstrap result can be combined with the point estimates as follows:
result <-
left_join(point_summary(d_circuits) %>% select(-dsd),
(bootstrapper_summary(bootstraps) %>% select(circuit_name, SD, L95, U95)),
by='circuit_name') %>%
mutate(CV = SD / dmean)
result
## # A tibble: 9 × 9
## year circuit circuit_name circuit_yday dmean SD L95 U95 CV
## <dbl> <dbl> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2013 1 Jul '13 198 0.369 1.31 0.0657 4.00 3.56
## 2 2013 2 Aug '13 235 0.0363 0.191 0.00353 0.434 5.27
## 3 2014 3 Aug '14 232 0.300 1.12 0.0557 2.64 3.74
## 4 2014 4 Sep ' 14 249 0.174 0.668 0.0299 1.83 3.85
## 5 2015 5 May-Jun '15 155 1.26 6.35 0.303 13.4 5.03
## 6 2015 6 Jun-Jul '15 179 3.95 14.4 1.11 38.4 3.64
## 7 2015 7 Late Jul '15 202 7.16 16.6 1.41 48.5 2.32
## 8 2015 8 Aug '15 226 0.934 3.80 0.240 8.07 4.07
## 9 2015 9 Sep '15 252 0.498 2.17 0.0909 5.45 4.35
This result can be plotted as follows:
ggplot(result %>%
mutate(circuit_name = factor(circuit_name, levels=circuit_name)),
aes(x=circuit_name,
y=dmean,
ymin=L95,
ymax=U95)) +
geom_errorbar(width=.1, color='grey40') +
geom_point(size=2.5) +
xlab(NULL) +
ylab('Density') +
theme_light()
Flying birds pose special challenges (see Spear, Nur & Ainley 1992, Spear & Ainley 1997, and this white paper on handling the issues involved). In order to account for the bias caused by the possibliity of repeat-counting a flying bird, we need to perform corrections that rely upon ship speed and direction; bird speed, direction and flight height; and published regressions that predict bird speed for different seabird groups at certain orientations to the wind and at certain heights above the water.
So first we get the flying murrelets in our dataset:
data(seabirds)
mamu <-
seabirds %>%
filter(effort == 'transect') %>%
filter(zone != 'OUT') %>%
filter(sp1 == 'MAMU' | sp2 == 'MAMU' | sp3 == 'MAMU') %>%
filter(motion %in% c('FLY')) %>%
mutate(spp = 'MAMU')
# Sample size
mamu %>% nrow
## [1] 149
To account for the biases introduced by flying bird “flux”, we bring in a built-in dataset that contains flight-wind regression data from Spear & Ainley (1997):
data(flight_groups)
flight_groups %>% head
## # A tibble: 5 × 14
## flight_group spp tail_mean tail_sd head_mean head_sd cross_mean cross_sd
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 VIII.23 COMU 22.1 11.3 14.5 10 18.8 9.4
## 2 VIII.24 RHAU 19.4 2.5 14.6 1.7 16.3 1.2
## 3 VIII.24 PIGU 19.4 2.5 14.6 1.7 16.3 1.2
## 4 VIII.25 CAAU 19 3.7 13.1 0.9 13.2 1.4
## 5 VIII.26 MAMU 35.2 31.7 22.4 7.7 22.6 12
## # ℹ 6 more variables: tail_b <dbl>, tail_a <dbl>, head_b <dbl>, head_a <dbl>,
## # cross_b <dbl>, cross_a <dbl>
We then use that regression data in the function
calculate_flight_flux()
, which scales group size by the
flux correction factor (see this
white paper for details):
# Mean group size before flux correction
mamu$best %>% mean(na.rm=TRUE)
## [1] 2.503356
# Calculate flux correction
mamu <- calculate_flight_flux(mamu, flight_groups)
# Apply to group estimates
mamu <-
mamu %>% mutate(best = best * correction_factor,
min = min * correction_factor,
max = max * correction_factor)
# Mean after
mamu$best %>% mean(na.rm=TRUE)
## [1] 0.2621422
Now that flux is accounted for, you can proceed using the same method for sitting birds above.
Total murrelet density will be the sum of sitting and flying birds.