| Title: | High-Performance GPS to GTFS Converter |
|---|---|
| Description: | Preprocesses raw GPS trajectory data of public transit and transforms it to GTFS format. Provides a high-performance R port of the 'gps2gtfs' Python package by Aaivu (Ratneswaran et al., 2023) <doi:10.1109/ICCT56969.2023.10075789>. Heavy computational tasks are offloaded to an extremely fast compiled Rust backend or a C++ (Rcpp) backend. Automatic backend selection prefers Rust, then Rcpp, and falls back to a slower pure R implementation utilizing 'data.table' and 'sf'. |
| Authors: | Egor Kotov [aut, cre, cph] (ORCID: <https://orcid.org/0000-0001-6690-5345>), Aaivu [cph] |
| Maintainer: | Egor Kotov <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-06-07 06:06:49 UTC |
| Source: | https://github.com/e-kotov/gps2gtfs |
Cleans raw GPS data by removing records with latitude or longitude equal to zero, parsing the device time, and sorting by device ID, date, and time.
g2g_clean_gps(raw_gps_df, projected = NULL)
| Argument | Description |
|---|---|
| raw_gps_df | A data.frame containing raw GPS data. Must include columns: |
| projected | Logical. Are the coordinates already projected? Default is NULL (auto-detect). |
A sorted data.table with additional date and time_str columns.
data(g2g_data_gps)
cleaned_gps <- g2g_clean_gps(g2g_data_gps)
head(cleaned_gps)
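The cleaning steps described above (drop zero coordinates, parse the device time, sort) can be sketched in base R. This is only an illustration, not the package's implementation: the column names `deviceid`, `latitude`, `longitude`, and `devicetime` are assumptions, and the package itself returns a data.table rather than a data.frame.

```r
# Toy input. Column names are hypothetical; the package's required columns may differ.
raw <- data.frame(
  deviceid   = c(2L, 1L, 1L, 2L),
  latitude   = c(7.2906, 0, 7.2910, 7.2950),
  longitude  = c(80.6337, 80.6340, 80.6345, 0),
  devicetime = c("2023-01-02 08:05:00", "2023-01-02 08:00:00",
                 "2023-01-02 08:01:00", "2023-01-02 08:06:00"),
  stringsAsFactors = FALSE
)

# 1. Drop records where latitude or longitude equals zero.
cleaned <- raw[raw$latitude != 0 & raw$longitude != 0, ]

# 2. Parse the device time and derive date and time-of-day strings.
parsed <- as.POSIXct(cleaned$devicetime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
cleaned$date <- format(parsed, "%Y-%m-%d")
cleaned$time_str <- format(parsed, "%H:%M:%S")

# 3. Sort by device ID, then date, then time.
cleaned <- cleaned[order(cleaned$deviceid, cleaned$date, cleaned$time_str), ]
rownames(cleaned) <- NULL
```

Two of the four toy records carry a zero coordinate, so only two rows survive, ordered by device ID.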
A lightweight subset of raw bus GPS data from Kandy, Sri Lanka.
This dataset is intended for examples and testing of the gps2gtfs pipeline.
g2g_data_gps
A data frame with 1045 rows and 6 variables:
Unique identifier for the GPS record.
Unique identifier for the tracking device (bus).
Timestamp of the GPS ping (UTC).
Latitude in WGS-84 degrees.
Longitude in WGS-84 degrees.
Recorded speed of the vehicle.
Original data from the Python gps2gtfs package.
Ratneswaran, S., & Thayasivam, U. (2023). Extracting potential Travel time information from raw GPS data and Evaluating the Performance of Public transit - a case study in Kandy, Sri Lanka. 2023 3rd International Conference on Intelligent Communication and Computational Techniques (ICCT), 1-7. doi:10.1109/ICCT56969.2023.10075789
Sample data containing the coordinates and metadata for bus stops.
This dataset works together with g2g_data_gps to extract stop times.
g2g_data_stops
A data frame with 23 rows and 6 variables:
Unique identifier for the bus stop.
Route identifier the stop belongs to.
Direction of the route the stop serves.
Latitude in WGS-84 degrees.
Longitude in WGS-84 degrees.
Address or name of the stop location.
Original data from the Python gps2gtfs package.
Ratneswaran, S., & Thayasivam, U. (2023). Extracting potential Travel time information from raw GPS data and Evaluating the Performance of Public transit - a case study in Kandy, Sri Lanka. 2023 3rd International Conference on Intelligent Communication and Computational Techniques (ICCT), 1-7. doi:10.1109/ICCT56969.2023.10075789
Sample data containing the coordinates for bus route terminals. Used to define the start and end of trips.
g2g_data_terminals
A data frame with 2 rows and 4 variables:
Unique identifier for the bus terminal.
Name of the terminal.
Latitude in WGS-84 degrees.
Longitude in WGS-84 degrees.
Original data from the Python gps2gtfs package.
Ratneswaran, S., & Thayasivam, U. (2023). Extracting potential Travel time information from raw GPS data and Evaluating the Performance of Public transit - a case study in Kandy, Sri Lanka. 2023 3rd International Conference on Intelligent Communication and Computational Techniques (ICCT), 1-7. doi:10.1109/ICCT56969.2023.10075789
Reads raw GPS and terminal CSV files, extracts trips, and optionally writes results to a CSV.
g2g_extract_trips(
  gps_data,
  terminals_data,
  terminals_buffer_radius,
  output_path = NULL,
  projected_crs = NULL,
  backend = "auto",
  projected = NULL
)
| Argument | Description |
|---|---|
| gps_data | A data.frame or path to the raw GPS CSV. |
| terminals_data | A data.frame or path to the terminal coordinates CSV. |
| terminals_buffer_radius | Numeric. Buffer radius for terminals (in meters). |
| output_path | Character. Optional path to write output trip features as CSV. Default is NULL. |
| projected_crs | Numeric. The EPSG code of a projected coordinate system used for metric distance calculations. Required when |
| backend | Character. The backend to use: |
| projected | Logical. Whether plain-table coordinates are already projected. Out-of-bounds coordinates require explicit |
A data.table containing extracted trip features.
data(g2g_data_gps)
data(g2g_data_terminals)
trips <- g2g_extract_trips(
  gps_data = g2g_data_gps,
  terminals_data = g2g_data_terminals,
  terminals_buffer_radius = 50
)
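To illustrate what a terminal buffer radius in meters means, the sketch below flags GPS pings that fall within 50 m of a terminal. It is not the package's method: the documentation states that a projected coordinate system is used for metric distances, whereas this example swaps in the haversine great-circle formula so that it runs in base R. The helper `haversine_m`, the coordinates, and the column names are all hypothetical.

```r
# Great-circle distance in meters between points given in WGS-84 degrees.
# Hypothetical helper; uses a mean Earth radius of 6,371,000 m.
haversine_m <- function(lat1, lon1, lat2, lon2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Made-up terminal location and three pings at increasing distance from it.
terminal <- c(lat = 7.2906, lon = 80.6337)
pings <- data.frame(
  lat = c(7.2906, 7.2909, 7.3000),
  lon = c(80.6337, 80.6337, 80.6337)
)

# A ping counts as "at the terminal" when it lies inside the buffer radius.
buffer_radius_m <- 50
dist_m <- haversine_m(pings$lat, pings$lon, terminal[["lat"]], terminal[["lon"]])
in_buffer <- dist_m <= buffer_radius_m
```

The first ping sits on the terminal, the second is roughly 33 m away, and the third is about a kilometer away, so only the first two fall inside the buffer.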
Reads raw GPS, terminal, and stop CSV files, extracts trips and stop times, and optionally writes results.
g2g_extract_trips_and_stop_times(
  gps_data,
  terminals_data,
  stops_data,
  terminals_buffer_radius,
  stops_buffer_radius,
  stops_extended_buffer_radius,
  output_trips_path = NULL,
  output_stops_path = NULL,
  projected_crs = NULL,
  backend = "auto",
  projected = NULL,
  stop_direction_map = NULL
)
| Argument | Description |
|---|---|
| gps_data | A data.frame or path to the raw GPS CSV. |
| terminals_data | A data.frame or path to the terminal coordinates CSV. |
| stops_data | A data.frame or path to the bus stop coordinates CSV. |
| terminals_buffer_radius | Numeric. Buffer radius for terminals (in meters). |
| stops_buffer_radius | Numeric. Buffer radius for bus stops (in meters). |
| stops_extended_buffer_radius | Numeric. Extended buffer radius for bus stops (in meters). |
| output_trips_path | Character. Optional path to write output trip features as CSV. Default is NULL. |
| output_stops_path | Character. Optional path to write output stop times as CSV. Default is NULL. |
| projected_crs | Numeric. The EPSG code of a projected coordinate system used for metric distance calculations. Required when |
| backend | Character. The backend to use: |
| projected | Logical. Whether plain-table coordinates are already projected. Out-of-bounds coordinates require explicit |
| stop_direction_map | Optional named character vector mapping each raw stop-direction label to its starting terminal ID. |
A list containing two data.tables: trips and stop_times (with columns matching GTFS standard naming).
data(g2g_data_gps)
data(g2g_data_terminals)
data(g2g_data_stops)
result <- g2g_extract_trips_and_stop_times(
  gps_data = g2g_data_gps,
  terminals_data = g2g_data_terminals,
  stops_data = g2g_data_stops,
  terminals_buffer_radius = 50,
  stops_buffer_radius = 30,
  stops_extended_buffer_radius = 50
)
head(result$trips)
head(result$stop_times)
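The `stop_direction_map` argument is described as a named character vector from raw stop-direction labels to starting terminal IDs. A minimal sketch of that shape follows. The direction labels and terminal IDs are invented for illustration and may not match the values in the bundled datasets.

```r
# Names are raw stop-direction labels; values are starting terminal IDs.
# Both the labels and the IDs here are hypothetical.
stop_direction_map <- c(
  "Kandy-Digana" = "BT01",
  "Digana-Kandy" = "BT02"
)

# Looking up by name resolves each stop's direction label to a terminal ID.
stop_directions <- c("Digana-Kandy", "Kandy-Digana", "Kandy-Digana")
start_terminals <- unname(stop_direction_map[stop_directions])
```

A vector built this way would presumably be passed as `stop_direction_map = stop_direction_map`, with names matching the direction values that actually occur in the stops data.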