Relevant Datasets

Selected datasets relevant to automated driving and mixed-traffic research

Selected Online Open-source Datasets

Dataset

Data Description (Type and Volume)

Relevant Tasks and Case Studies

Data Samples Screenshot

Image & video with annotation; 100K video clips & images, 1.8 TB

Perception: Semantic segmentation; Lane detection

7 sensor stations equipped with more than 60 state-of-the-art, multi-modal sensors, covering a road network of approximately 3.5 km; three data sets (R0, R1, R2)

Perception; Digital Twin; Motion Prediction

Motion: TFRecord format with object trajectories and corresponding 3D maps for 103,354 segments; Perception: lidar and camera data with labels for 2,030 segments

Motion: Motion Prediction, Interaction, Occupancy and Flow Prediction, Sim Agents; Perception: Segmentation, Object Detection & Tracking, Pose Estimation

170,000 scenes around an automated vehicle; 1,000+ hours; zarr format with Python toolkit

Motion Prediction

104 hours of video; GPS/IMU, CAN, etc.

HDD: Learning Driver Behaviour, Causal Reasoning; H3D: 3D Multi-Object Detection and Tracking; HSD: Traffic Scene Classification; HEV-I: Vehicle Localization; HAD: Human-to-Vehicle Advice for End-to-End Self-Driving; TITAN: Trajectory Forecasting

10,000 video clips; annotations for 12 classes (bounding box, tracking ID, class ID)

Currently unreachable

Perception: Object Detection & Tracking

Video, point cloud, GPS, and driver behaviour (speed and wheel); 1000 km

Driving Policy Prediction

Drone-based vehicle trajectories; 1,140 minutes of drone video at 30 FPS, recorded at 12 different locations

VR Driving Simulation; Digital Twin; Sensor Simulation; Driving Behaviour Analysis; Safety & Crash Analysis

Drone-based collection; 110,500 vehicles; 147 hours; CSV (Highway, Interaction, Roundabout)

Behaviour Extraction & Analysis; Intention/Behaviour/Motion Prediction; Imitation Learning

2 grayscale cameras, 2 color cameras, 4 Edmund Optics lenses, one 3D laser scanner (10 Hz); 6 hours; 50 scenes, 180 GB

Perception: Object Detection & Tracking; Semantic and Instance Segmentation; Road/Lane Detection

1,000 driving scenes; 23 object classes annotated with 3D bounding boxes at 2 Hz; 1.4M camera images, 390k LIDAR sweeps, 1.4M RADAR sweeps, and 1.4M object bounding boxes in 40k keyframes

Perception: 3D Detection and Tracking; Prediction

1,200 h (Boston, Pittsburgh, Las Vegas and Singapore) + 838 (Las Vegas); 2D high-definition maps; estimated states of all traffic lights; Python toolkit provided

Motion Planning, Motion Prediction

1: 3D Tracking Dataset with 113 3D-annotated scenes; 2: Sensor Dataset with 1,000 3D-annotated scenarios (lidar, ring camera, and stereo sensor data) and Lidar Dataset with 20,000 unlabeled scenarios

Perception: 3D Tracking; Motion Forecasting

Image (video) with annotation; 133K images

Perception: Semantic segmentation; Lane detection

Video with semantic annotation; >140K images (video frames)

Perception: Semantic segmentation; Object & Lane detection

Image & video with annotation; two image sets: 7K and 5K

Perception: Semantic segmentation; Lane detection

Video, LiDAR, GPS; 10 videos

Perception: Semantic segmentation; Lane detection

LiDAR and stereo images with various position sensors, targeting a highly complex urban environment; tar.gz format

SLAM; Odometry

Image (Traffic sign) with annotation

Perception: Object Detection

700+ images; 10+ minutes of high-quality 30 Hz footage with corresponding semantically labeled images at 1 Hz and, in part, 15 Hz

Perception: Segmentation & Recognition

Video with annotation (bounding box, behavioural label); 347 videos, 170 GB

Perception: Object Detection; Behaviour Analysis

Video with behavioural label, GPS, vehicle data; 35 videos

Behaviour Analysis

Video, LiDAR, GPS, and vehicle data with annotation (bounding box); 300 GB

Perception: Object Detection, Object Tracking; End2End Learning; Imitation Learning

Video (image) with annotation (vehicle and traffic sign); 3 (vehicle) + several (traffic sign) videos

Perception: Object Detection

Video (image) with annotation; 5,000 (manual) + 20,100 (semi-auto) frames

Perception: Object Detection, Semantic Segmentation; Imitation Learning

Image (video) and LiDAR with semantic segmentation, point cloud segmentation, and 3D bounding boxes; 41,280 (image) + 12,499 (3D) + 390,000 (unlabeled sensor) frames

Perception: Object Detection, Object Tracking; End2End Learning; Imitation Learning

25,000 high-resolution images; 124 semantic object categories; 100 instance-annotated categories; global reach, covering 6 continents

Perception: Street-level Instance Segmentation

56,000 camera images; 7,000 LiDAR sweeps; 75 scenes of 50-100 frames each; 10 annotation classes; full sensor suite: 1 LiDAR, 8 cameras, post-processed GPS/IMU; adverse weather conditions (snow)

Perception: (3D) Object Detection, Object Tracking; Trajectory Prediction

LiDAR point cloud data; LAS, XML, SHP

Perception: (3D) Object Detection

1 year, 1000 km; 20 million images along with LIDAR, GPS, and INS ground truth

Perception: Object Detection, Object Tracking; Dense Reconstruction; Localization

Two vehicles driving cooperatively in the same location at the same time; 410 km of driving area; 20K LiDAR frames, 40K RGB frames, and 240K annotated 3D bounding boxes across 5 vehicle classes

Perception: Vehicle-to-Vehicle Cooperative Perception; (3D) Object Detection, Tracking, Prediction, Localization; Sim2Real Transfer Learning

Basic safety messages (BSM), vehicle trajectories, and various driver-vehicle interaction data; CSV format

Interactive Behaviour Extraction & Analysis; Safety Analysis; Driving Anomaly Detection

Roundabout: 10,479 trajectories, 365 mins; Unsignalized intersection: 14,867 trajectories, 433 mins; Lane change: 10,933 trajectories, 133 mins; Signalized intersection: 3,775 trajectories, 60 mins; high-definition maps in Lanelet2 format

Intention/Behaviour/Motion Prediction; Imitation Learning; Reinforcement Learning; Interactive Behaviour Extraction & Analysis
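When comparing scenario coverage, it can help to tally the per-scenario figures in the last row above. The following is a minimal sketch, not part of any dataset toolkit: the `SCENARIOS` table and the `summarize` helper are hypothetical names introduced here, and the numbers are simply copied from that row.

```python
# Per-scenario (trajectory count, recorded minutes), copied from the row above.
# SCENARIOS and summarize() are illustrative names, not part of any dataset API.
SCENARIOS = {
    "Roundabout": (10479, 365),
    "Unsignalized intersection": (14867, 433),
    "Lane change": (10933, 133),
    "Signalized intersection": (3775, 60),
}


def summarize(scenarios):
    """Return total trajectories, total minutes, and trajectories per minute."""
    total_traj = sum(count for count, _ in scenarios.values())
    total_min = sum(minutes for _, minutes in scenarios.values())
    per_minute = {
        name: count / minutes for name, (count, minutes) in scenarios.items()
    }
    return total_traj, total_min, per_minute


total_trajectories, total_minutes, per_minute = summarize(SCENARIOS)
print(f"{total_trajectories} trajectories over {total_minutes} minutes")

# List scenarios from densest to sparsest recording.
for name, rate in sorted(per_minute.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {rate:.1f} trajectories per recorded minute")
```

The per-minute rate is only a rough density indicator, since recording conditions differ between locations.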
