Residence Index¶
Kessel et al. Paper https://www.researchgate.net/publication/279269147
This residence index tool takes a compressed or uncompressed detection file and calculates the residency index for each station/receiver in the detections. A CSV file is written to the data directory for future use. The function returns a Pandas DataFrame, which can be used to plot the information. The residence index is calculated from exactly the data passed to the function, so make sure you pass only the data you want taken into consideration (e.g. species, stations, tags, etc.).
detections: The CSV file in the data directory that is either compressed or raw. If the file is not compressed please allow the program time to compress the file and add the rows to the database. A compressed file will be created in the data directory. Use the compressed file for any future runs of the residence index function.
calculation_method: The method used to calculate the residence index. Methods are:
- kessel
- timedelta
- aggregate_with_overlap
- aggregate_no_overlap
project_bounds: North, South, East, and West bounding longitudes and latitudes for visualization.
The calculation methods are listed and described below before they are called. The function will default to the Kessel method when nothing is passed.
Below is an example of initial variables to set up, which are the detection file and the project bounds.
Warning
Input files must include datecollected, station, longitude, latitude, catalognumber, and unqdetecid as columns.
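The column requirement above can be checked before running a calculation. The following is a minimal sketch using plain Pandas; the helper name `missing_columns` and the sample values (including the station name) are hypothetical and are not part of the tool.

```python
import pandas as pd

# Columns the warning above lists as required in the input file.
REQUIRED_COLUMNS = {
    "datecollected", "station", "longitude",
    "latitude", "catalognumber", "unqdetecid",
}


def missing_columns(detections: pd.DataFrame) -> list:
    """Return the required columns absent from the frame, sorted by name."""
    return sorted(REQUIRED_COLUMNS - set(detections.columns))


# Made-up sample frame that lacks two of the required columns.
sample = pd.DataFrame({
    "datecollected": ["2016-01-01 01:02:43"],
    "station": ["HFX001"],  # hypothetical station name
    "longitude": [-63.57],
    "latitude": [44.65],
})

missing = missing_columns(sample)
print(missing)
```

An empty list means every required column is present.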
Kessel Residence Index Calculation¶
The Kessel method converts both the startdate and enddate columns into a date with no hours, minutes, or seconds. Next it creates a list of the unique days where a detection was seen. The size of the list is returned as the total number of days as an integer. This calculation is used to determine the total number of distinct days (T) and the total number of distinct days per station (S).
\(RI = \frac{S}{T}\)
RI = Residence Index
S = Distinct number of days detected at the station
T = Distinct number of days detected anywhere on the array
Warning
A rounding error may occur: a detection on 2016-01-01 23:59:59 and a detection on 2016-01-02 00:00:01 would be counted as two days, even though they are only about 2 seconds apart.
Example Code¶
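The calculation described above can be sketched directly in Pandas. This is an illustration under stated assumptions, not the tool's own code: it assumes a compressed detections frame with `station`, `startdate`, and `enddate` columns, and the function names `distinct_days` and `kessel_residence_index` are hypothetical. The sample data is made up and includes the midnight case from the warning.

```python
import pandas as pd


def distinct_days(dets: pd.DataFrame) -> int:
    """Count the distinct calendar days touched by any startdate or enddate."""
    start_days = pd.to_datetime(dets["startdate"]).dt.date
    end_days = pd.to_datetime(dets["enddate"]).dt.date
    return len(set(start_days) | set(end_days))


def kessel_residence_index(dets: pd.DataFrame) -> pd.Series:
    """Return RI = S / T for every station in the frame."""
    total_days = distinct_days(dets)  # T: distinct days anywhere on the array
    station_days = pd.Series(
        {station: distinct_days(group) for station, group in dets.groupby("station")}
    )  # S: distinct days per station
    return station_days / total_days


# Made-up detections: station A is seen 2 seconds apart across midnight,
# which still counts as two distinct days.
dets = pd.DataFrame({
    "station": ["A", "A", "B"],
    "startdate": ["2016-01-01 23:59:59", "2016-01-02 00:00:01", "2016-01-03 08:00:00"],
    "enddate": ["2016-01-01 23:59:59", "2016-01-02 00:00:01", "2016-01-03 08:30:00"],
})

ri = kessel_residence_index(dets)
print(ri)
```

Here T is 3 days, station A has S = 2, and station B has S = 1.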
Timedelta Residence Index Calculation¶
The Timedelta calculation method finds the first startdate and the last enddate of all detections. The difference between the two is the time span used in calculating the residence index. The timedelta for each station is divided by the timedelta of the array to determine the residence index.
\(RI = \frac{\Delta S}{\Delta T}\)
RI = Residence Index
\(\Delta S\) = Last detection time at a station - First detection time at the station
\(\Delta T\) = Last detection time on an array - First detection time on the array
Example Code¶
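A minimal Pandas sketch of the ratio above, assuming a detections frame with `station`, `startdate`, and `enddate` columns. The function name `timedelta_residence_index` and the sample data are hypothetical; this is not the tool's implementation.

```python
import pandas as pd


def timedelta_residence_index(dets: pd.DataFrame) -> pd.Series:
    """Return RI = delta S / delta T for every station in the frame."""
    frame = pd.DataFrame({
        "station": dets["station"],
        "start": pd.to_datetime(dets["startdate"]),
        "end": pd.to_datetime(dets["enddate"]),
    })
    # delta T: last detection time on the array minus the first.
    array_span = frame["end"].max() - frame["start"].min()
    # delta S: last detection time at each station minus the first.
    per_station = frame.groupby("station").agg(
        first=("start", "min"), last=("end", "max")
    )
    station_span = per_station["last"] - per_station["first"]
    return station_span / array_span


# Made-up detections spanning four days on the array.
dets = pd.DataFrame({
    "station": ["A", "A", "B"],
    "startdate": ["2016-01-01 00:00:00", "2016-01-02 00:00:00", "2016-01-04 00:00:00"],
    "enddate": ["2016-01-01 06:00:00", "2016-01-03 00:00:00", "2016-01-05 00:00:00"],
})

ri = timedelta_residence_index(dets)
print(ri)
```

Station A spans 2 of the array's 4 days and station B spans 1, so the values need not sum to 1.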
Aggregate With Overlap Residence Index Calculation¶
The Aggregate With Overlap calculation method takes the length of time of each detection and sums them together. A total is returned. The sum for each station is then divided by the sum of the array to determine the residence index.
\(RI = \frac{AwOS}{AwOT}\)
RI = Residence Index
AwOS = Sum of length of time of each detection at the station
AwOT = Sum of length of time of each detection on the array
Example Code¶
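The summation above can be sketched as follows, assuming a detections frame with `station`, `startdate`, and `enddate` columns. The function name `aggregate_with_overlap_ri` and the sample data are hypothetical, and the sketch is not the tool's own code.

```python
import pandas as pd


def aggregate_with_overlap_ri(dets: pd.DataFrame) -> pd.Series:
    """Return RI = AwOS / AwOT for every station in the frame."""
    starts = pd.to_datetime(dets["startdate"])
    ends = pd.to_datetime(dets["enddate"])
    # Length of each detection in seconds; overlapping time is counted twice.
    durations = (ends - starts).dt.total_seconds()
    station_sum = durations.groupby(dets["station"]).sum()  # AwOS
    array_sum = durations.sum()  # AwOT
    return station_sum / array_sum


# Made-up detections: the two at station A overlap by five minutes.
dets = pd.DataFrame({
    "station": ["A", "A", "B"],
    "startdate": ["2016-01-01 01:00:00", "2016-01-01 01:05:00", "2016-01-01 02:00:00"],
    "enddate": ["2016-01-01 01:10:00", "2016-01-01 01:15:00", "2016-01-01 02:10:00"],
})

ri = aggregate_with_overlap_ri(dets)
print(ri)
```

Because every detection contributes to exactly one station sum and to the array sum, the values in this sketch add up to 1.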
Aggregate No Overlap Residence Index Calculation¶
The Aggregate No Overlap calculation method takes the length of time of each detection and sums them together. However, any overlap in time between one or more detections is excluded from the sum.
For example, if the first detection is from 2016-01-01 01:02:43 to 2016-01-01 01:10:12 and the second detection is from 2016-01-01 01:09:01 to 2016-01-01 01:12:43, then the two detections cover the single span from 01:02:43 to 01:12:43, so their sum is 10 minutes.
A total is returned once all detections have been added without overlap. The sum for each station is then divided by the sum of the array to determine the residence index.
\(RI = \frac{AnOS}{AnOT}\)
RI = Residence Index
AnOS = Sum of length of time of each detection at the station, excluding any overlap
AnOT = Sum of length of time of each detection on the array, excluding any overlap
Example Code¶
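Excluding overlap amounts to merging intervals before summing them. The sketch below uses only the standard library and reproduces the 10 minute example from the text; the function name `merged_duration` is hypothetical, and the tool's own merging logic may differ in detail.

```python
from datetime import datetime, timedelta


def merged_duration(intervals):
    """Sum interval lengths after merging any that overlap or touch."""
    total = timedelta(0)
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the current merged block if needed.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


fmt = "%Y-%m-%d %H:%M:%S"
example = [
    (datetime.strptime("2016-01-01 01:02:43", fmt),
     datetime.strptime("2016-01-01 01:10:12", fmt)),
    (datetime.strptime("2016-01-01 01:09:01", fmt),
     datetime.strptime("2016-01-01 01:12:43", fmt)),
]

total = merged_duration(example)
print(total)  # 0:10:00
```

Under this sketch, AnOS would be the merged duration of one station's detections and AnOT the merged duration of all detections on the array.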
Mapbox¶
Alternatively, you can use a Mapbox access token to plot your map. Mapbox is much more responsive than the standard Scattergeo plot.
Example Code¶
Residence Index Functions¶
-
kessel_ri.total_days_diff(detections)¶ Determines the total days difference.
The difference is determined by the minimum startdate of every detection and the maximum enddate of every detection. Both are converted into a datetime then subtracted to get a timedelta. The timedelta is converted to seconds and divided by the number of seconds in a day (86400). The function returns a floating point number of days (e.g. 503.76834).
Parameters: detections – Pandas DataFrame pulled from the compressed detections CSV
Returns: A float in the number of days
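The steps described for this function can be sketched in a few lines of Pandas. The name `total_days_diff_sketch` is hypothetical and the sample frame is made up; the sketch only assumes `startdate` and `enddate` columns.

```python
import pandas as pd

SECONDS_PER_DAY = 86400


def total_days_diff_sketch(detections: pd.DataFrame) -> float:
    """Days between the earliest startdate and the latest enddate."""
    first = pd.to_datetime(detections["startdate"]).min()
    last = pd.to_datetime(detections["enddate"]).max()
    return (last - first).total_seconds() / SECONDS_PER_DAY


# Made-up detections covering one and a half days in total.
detections = pd.DataFrame({
    "startdate": ["2016-01-01 00:00:00", "2016-01-01 18:00:00"],
    "enddate": ["2016-01-01 00:30:00", "2016-01-02 12:00:00"],
})

days = total_days_diff_sketch(detections)
print(days)  # 1.5
```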
-
kessel_ri.total_days_count(detections)¶ The function below takes a Pandas DataFrame and determines the number of days any detections were seen on the array.
The function converts both the startdate and enddate columns into a date with no hours, minutes, or seconds. Next it creates a list of the unique days where a detection was seen. The size of the list is returned as the total number of days as an integer.
NOTE: A rounding error may occur, as a detection on 2016-01-01 23:59:59 and a detection on 2016-01-02 00:00:01 would be counted as two days when they are really only about 2 seconds apart.
Parameters: detections – Pandas DataFrame pulled from the compressed detections CSV
Returns: An int in the number of days
-
kessel_ri.aggregate_total_with_overlap(detections)¶ The function below aggregates timedelta of startdate and enddate of each detection into a final timedelta then returns a float of the number of days. If the startdate and enddate are the same, a timedelta of one second is assumed.
Parameters: detections – Pandas DataFrame pulled from the compressed detections CSV
Returns: A float in the number of days
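The one second assumption for zero-length detections can be illustrated with the standard library. The helper name `detection_days` is hypothetical; the sketch shows the rule for a single detection only.

```python
from datetime import datetime, timedelta

SECONDS_PER_DAY = 86400


def detection_days(start: datetime, end: datetime) -> float:
    """Length of one detection in days, treating a zero-length one as 1 second."""
    length = end - start
    if length == timedelta(0):
        # Same startdate and enddate: assume the detection lasted one second.
        length = timedelta(seconds=1)
    return length.total_seconds() / SECONDS_PER_DAY


moment = datetime(2016, 1, 1, 1, 2, 43)
single_ping = detection_days(moment, moment)
half_day = detection_days(moment, moment + timedelta(hours=12))
print(single_ping, half_day)
```

Summing this value over all detections gives a total in days in which overlapping time is counted more than once.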
-
kessel_ri.aggregate_total_no_overlap(detections)¶ The function below aggregates timedelta of startdate and enddate, excluding overlap between detections. Overlapping detections are merged into a single detection that uses the earliest startdate and the latest enddate. If the startdate and enddate are the same, a timedelta of one second is assumed.
Parameters: detections – Pandas DataFrame pulled from the compressed detections CSV
Returns: A float in the number of days
-
kessel_ri.get_days(dets, calculation_method='kessel')¶ Determines which calculation method to use for the residency index.
Wrapper method for the calculation methods above.
Parameters: - dets – A Pandas DataFrame pulled from the compressed detections CSV
- calculation_method – determines which method above will be used to count total time and station time
Returns: An int in the number of days
-
kessel_ri.get_station_location(station, detections)¶ Returns the longitude and latitude of a station/receiver given the station and the table name.
Parameters: - station – String that contains the station name
- detections – the table name in which to find the station
Returns: A Pandas DataFrame of station, latitude, and longitude
-
kessel_ri.plot_ri(ri_data, ipython_display=True, title='Bubble Plot', height=700, width=1000, plotly_geo=None, filename=None, marker_size=6, colorscale='Viridis', mapbox_token=None)¶
Parameters: - ri_data – A Pandas DataFrame generated from residency_index()
- ipython_display – a boolean to show in a notebook
- title – the title of the plot
- height – the height of the plot
- width – the width of the plot
- plotly_geo – an optional dictionary to control the geographic aspects of the plot
- filename – Plotly filename to write to
- mapbox_token – A Mapbox access token string
- marker_size – An int to indicate the diameter in pixels
- colorscale – A string to indicate the color index
Returns: A plotly geoscatter
-
kessel_ri.residency_index(detections, calculation_method='kessel')¶ This function takes in a detections CSV and determines the residency index for each station.
Residence Index (RI) was calculated as the number of days an individual fish was detected at each receiver station divided by the total number of days the fish was detected anywhere on the acoustic array. - Kessel et al.
Parameters: - detections – CSV Path
- calculation_method – determines which method above will be used to count total time and station time
Returns: A residence index DataFrame with the following columns
- days_detected
- latitude
- longitude
- residency_index
- station