VisionPilot: Autonomous Driving Simulation, Computer Vision & Real-Time Perception (BeamNG.tech)

Overview

A modular Python project for autonomous driving research and prototyping, fully integrated with the BeamNG.tech simulator and Foxglove visualization. This system combines traditional computer vision algorithms and deep learning (CNN, YOLO) with real-time sensor fusion and autonomous vehicle control to tackle:

  • Multi-Lane Detection: YOLOP, Traditional CV
  • Traffic Sign: Classification & Detection
  • Traffic Lights: Classification & Detection
  • Object Detection: Vehicles, pedestrians, cyclists and more
  • Multi-Sensor Fusion: Camera, Lidar, Radar, GPS, IMU
  • Real-Time Control: Model Predictive Control (MPC) for integrated steering & throttle optimization
  • Visualization: Real-time monitoring with Foxglove WebSocket + multiple CV windows
  • Configuration System: YAML-based modular settings

System Architecture & Data Flow

The following diagram illustrates the complete data flow from the simulation environment through perception, control, and final actuation:

┌─────────────────────────────────────────────────────────────────┐
│                     BeamNG.tech Simulation                      │
│  (Camera, Lidar, Radar, GPS, IMU, Vehicle Speed, Orientation)   │
└───────────────────────────────┬─────────────────────────────────┘
                                │ Sensor Data Stream
┌───────────────────────────────▼─────────────────────────────────┐
│                      Perception Layer                           │
│                                                                 │
│  ┌──────────────┐   ┌───────────────┐   ┌───────────────────┐   │
│  │ CV Lane Det. │   │  YOLOP Model  │   │ Object Detection  │   │
│  │ (Lane Center,│ + │ (Segmentation,│ + │ (Vehicles, Signs, │   │
│  │  Deviation)  │   │  Drivable)    │   │  Traffic Lights)  │   │
│  └──────────────┘   └───────────────┘   └───────────────────┘   │
└───────────────────────────────┬─────────────────────────────────┘
                                │ Waypoints, Lane Metrics, Obstacles
┌───────────────────────────────▼─────────────────────────────────┐
│                  Planning & Control Layer                       │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │              Model Predictive Control (MPC)               │  │
│  │  - Plans 1-sec trajectory based on exact vehicle state    │  │
│  │  - Optimizes smooth Steering + Throttle simultaneously    │  │
│  │  - Constrained by vehicle dynamics & physical bounds      │  │
│  └───────────────────────────────────────────────────────────┘  │
└───────────────────────────────┬─────────────────────────────────┘
                                │ Proposed Control (Steering, Throttle)
┌───────────────────────────────▼─────────────────────────────────┐
│                     Active Safety Layer                         │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │             Automatic Emergency Braking (AEB)             │  │
│  │  - Monitors continuous Radar TTC (Time-To-Collision)      │  │
│  │  - Overrides MPC throttle limits if collision is imminent │  │
│  └───────────────────────────────────────────────────────────┘  │
└───────────────────────────────┬─────────────────────────────────┘
                                │ Final Actuated Commands
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                 Vehicle Control (BeamNG.tech)                   │
│                 (Steering, Throttle, Braking)                   │
└─────────────────────────────────────────────────────────────────┘
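
The layer ordering in the diagram can be read as a simple function composition. The following is a minimal sketch, not VisionPilot's actual code: `run_step`, the `toy_*` stand-ins, the dictionary keys, and the `(steering, throttle, brake)` tuple layout are all hypothetical names chosen for illustration. It only shows that the safety layer sees the planner's proposal and has the final say.

```python
from typing import Any, Callable, Dict, Tuple

Sensors = Dict[str, Any]
Percepts = Dict[str, Any]
Command = Tuple[float, float, float]  # (steering, throttle, brake), layout assumed


def run_step(
    sensors: Sensors,
    perceive: Callable[[Sensors], Percepts],
    plan: Callable[[Percepts], Command],
    safety: Callable[[Command, Percepts], Command],
) -> Command:
    """Pass one frame of sensor data through the layers in diagram order."""
    percepts = perceive(sensors)       # Perception Layer
    proposed = plan(percepts)          # Planning & Control Layer
    return safety(proposed, percepts)  # Active Safety Layer decides last


# Toy stand-ins so the sketch runs; none of these reflect the real modules.
def toy_perceive(sensors: Sensors) -> Percepts:
    return {"lane_offset_m": sensors["lane_offset_m"], "ttc_s": sensors["ttc_s"]}


def toy_plan(percepts: Percepts) -> Command:
    # Proportional steer toward the lane center (placeholder for the planner).
    steering = max(-1.0, min(1.0, -0.5 * percepts["lane_offset_m"]))
    return (steering, 0.4, 0.0)


def toy_safety(proposed: Command, percepts: Percepts) -> Command:
    steering, _throttle, _brake = proposed
    if percepts["ttc_s"] <= 1.0:
        return (steering, 0.0, 1.0)  # cut throttle, brake (brake value assumed)
    return proposed


if __name__ == "__main__":
    clear = {"lane_offset_m": 0.4, "ttc_s": 5.0}
    danger = {"lane_offset_m": 0.4, "ttc_s": 0.8}
    print(run_step(clear, toy_perceive, toy_plan, toy_safety))
    print(run_step(danger, toy_perceive, toy_plan, toy_safety))
```

Passing the stages in as callables keeps each layer replaceable, which matches the modular intent described in the overview.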

Demos

Multi-Lane Detection

Evaluation of the multi-lane perception pipeline across various environmental edge cases, including high-glare transitions, low-light tunnels, and heavy atmospheric fog:

AEB Demo

Extended Demo: Watch the full video here


Emergency Braking (AEB) Demo

Watch the Emergency Braking System (AEB) in action with real-time radar filtering and collision avoidance:

AEB Demo

Extended Demo: Watch the full video here


Blind Spot Detection (BSD) Demo

See the Blind Spot Detection (BSD) system in action using radar data to identify vehicles in the blind spot:

Blind Spot Detection Demo

Extended Demo: Watch the full video here


Sign Detection & Classification Demo

This demo shows real-time traffic sign detection and classification:

Sign Detection Demo & Vehicle Pedestrian

Extended Demo: Watch the full video here

VisionPilot does not yet support multi-camera. This is for demonstration purposes only.


Traffic Light Detection & Classification Demo

This demo shows real-time traffic light detection and classification:

Traffic Light Detection & Classification Demo

No extended demo available yet.


Latest Lane Detection Demo (v2)

Watch the improved autonomous lane keeping demo (v2) in BeamNG.tech, featuring smoother fused CV+SCNN lane detection, stable PID steering, and robust cruise control:

Lane Detection Demo

Extended Demo: Watch the full video here

Note: Very low-light (tunnel) scenarios are not yet supported.

Previous Lane Detection Demo (v1)

The original demo is still available for reference:

Lane Keeping & Multi-Model Detection Demo (v1)


YOLOP Lane Detection Demo

Watch both the raw model segmentation output and the multiple processed lanes on a highway video.

YOLOP Lane Detection Demo

Extended Demo: Watch the full video here

Note: This is not the final integration of the YOLOP model in VisionPilot. It only serves as a demo of the model's capabilities and use cases for VisionPilot.


Foxglove Visualization Demo

See real-time LiDAR point cloud streaming and autonomous vehicle telemetry in Foxglove Studio:

Foxglove Visualization Demo

Extended Demo: Watch the full video here


Segmentation Demo

See real-time image segmentation using front and rear cameras:

Segmentation Demo

Extended Demo: Watch the full video here


More demo videos and visualizations will be added as features are completed.

Sensor Suite

The vehicle is equipped with a comprehensive multi-sensor suite for autonomous perception and control:

| Sensor | Specification | Purpose |
| --- | --- | --- |
| Front Camera | 1920x1080 @ 50Hz, 70° FOV, Depth enabled | Lane detection, traffic signs, traffic lights, object detection |
| LiDAR (Top) | 80 vertical lines, 360° horizontal, 120m range, 20Hz | Obstacle detection, 3D scene understanding |
| Front Radar | 200m range, 128×64 bins, 50Hz | Collision avoidance, adaptive cruise control |
| Rear Left & Right Radar | 30m range, 64×32 bins, 50Hz | Blindspot monitoring, rear object detection |
| Dual GPS | Front & rear positioning @ 50Hz | Localization |
| IMU | 100Hz update rate | Vehicle dynamics, pose estimation |

Configuration files are located in the /config directory.

Control Architecture Evolution: PID → MPC

VisionPilot has transitioned from traditional PID-based control to advanced Model Predictive Control (MPC) for superior performance:

| Aspect | PID Control (Legacy) | MPC Control (Current) |
| --- | --- | --- |
| Strategy | Reactive (error-based) | Predictive (horizon-based) |
| Prediction | None - responds to current error | Looks 1 second ahead |
| Steering Control | Separate lateral controller | Integrated optimization |
| Throttle Control | Separate cruise control | Integrated optimization |
| Decision Making | Independent (steering ≠ throttle) | Simultaneous & coupled |
| Physics Awareness | Limited | Full vehicle dynamics model |
| Comfort | May be jerky/oscillatory | Smooth by design (cost-weighted) |
| Obstacle Handling | Reactive braking only | Proactive path planning |
| Computational Load | Low (~1ms) | Medium (~20ms) |
| Tuning Complexity | High (multiple PIDs) | Lower (cost matrices Q, R) |
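
To make the "reactive (error-based)" column concrete, here is a minimal textbook PID sketch. It is not the project's legacy controller: the `PID` class, its gains, and the output limits are placeholders chosen for illustration. The point is that the output depends only on the current and past error, with no look-ahead, and that steering and speed would each need their own separately tuned instance.

```python
from typing import Optional


class PID:
    """Textbook discrete PID controller with output clamping."""

    def __init__(self, kp: float, ki: float, kd: float,
                 out_min: float = -1.0, out_max: float = 1.0) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        """Return the control output for the current error sample."""
        if dt <= 0.0:
            raise ValueError("dt must be positive")
        self.integral += error * dt
        # No derivative term on the first sample: there is no previous error yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        raw = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(self.out_min, min(self.out_max, raw))


if __name__ == "__main__":
    # Two independent loops, as in the "Legacy" column (gains are placeholders).
    steering_pid = PID(kp=0.5, ki=0.0, kd=0.1)
    speed_pid = PID(kp=0.2, ki=0.05, kd=0.0, out_min=0.0, out_max=1.0)
    print(steering_pid.update(error=0.4, dt=0.02))  # lateral offset error
    print(speed_pid.update(error=3.0, dt=0.02))     # speed error in m/s
```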

Why MPC?

MPC fundamentally changes how the vehicle makes decisions:

  • Looks ahead: Plans the next 1 second of motion
  • Optimizes together: Steering and throttle decisions are made simultaneously, respecting vehicle physics
  • Respects constraints: Hard physical limits (steering angle, acceleration) are built in
  • Smooth control: The cost function naturally penalizes jerky inputs
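
The points above can be illustrated by scoring a candidate control sequence over a 1-second horizon. This is a sketch under stated assumptions, not the project's MPC: it uses a kinematic bicycle model, a straight reference lane along the x-axis, made-up vehicle limits and cost weights, and it only evaluates the cost. A real MPC hands this kind of cost to an optimizer and treats the limits as solver constraints rather than clamping inputs as done here.

```python
import math
from typing import Sequence, Tuple

State = Tuple[float, float, float, float]  # x [m], y [m], yaw [rad], speed [m/s]

WHEELBASE_M = 2.7     # assumed vehicle parameter
MAX_STEER_RAD = 0.5   # assumed steering limit
MAX_ACCEL = 3.0       # assumed acceleration limit [m/s^2]


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def bicycle_step(state: State, steer: float, accel: float, dt: float) -> State:
    """Advance a kinematic bicycle model by one time step."""
    x, y, yaw, v = state
    steer = clamp(steer, -MAX_STEER_RAD, MAX_STEER_RAD)
    accel = clamp(accel, -MAX_ACCEL, MAX_ACCEL)
    x += v * math.cos(yaw) * dt
    y += v * math.sin(yaw) * dt
    yaw += v / WHEELBASE_M * math.tan(steer) * dt
    v = max(0.0, v + accel * dt)
    return (x, y, yaw, v)


def horizon_cost(state: State, steers: Sequence[float], accels: Sequence[float],
                 v_ref: float, dt: float = 0.1, q_lat: float = 1.0,
                 q_speed: float = 0.1, r_input: float = 0.01,
                 r_rate: float = 1.0) -> float:
    """Sum tracking, effort, and smoothness penalties over the horizon."""
    cost = 0.0
    prev_steer = steers[0]
    for steer, accel in zip(steers, accels):
        state = bicycle_step(state, steer, accel, dt)
        _, y, _, v = state
        cost += q_lat * y ** 2 + q_speed * (v - v_ref) ** 2  # stay centered, hold speed
        cost += r_input * (steer ** 2 + accel ** 2)          # penalize large inputs
        cost += r_rate * (steer - prev_steer) ** 2           # penalize jerky steering
        prev_steer = steer
    return cost


if __name__ == "__main__":
    start: State = (0.0, 0.5, 0.0, 10.0)  # 0.5 m off the lane center at 10 m/s
    steps = 10                            # 10 steps of 0.1 s = 1 s horizon
    hold = horizon_cost(start, [0.0] * steps, [0.0] * steps, v_ref=10.0)
    correct = horizon_cost(start, [-0.05] * steps, [0.0] * steps, v_ref=10.0)
    print(hold, correct)  # the gentle correction scores lower than holding course
```

Because steering and acceleration enter the same cost, an optimizer trades them off together, which is the "optimizes together" property in the list above.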

Layered Safety Architecture: MPC + AEB

VisionPilot uses a two-layer safety approach that combines proactive planning with reactive fallback:

How it works:

  1. MPC computes optimal control considering lane following and smooth acceleration
  2. AEB monitors radar for imminent collisions:
    • TTC ≤ 1.0s: Emergency brake (throttle = 0, acts as safety net)
    • TTC ≤ 2.5s: Reduce throttle to 50% (MPC still controls steering for avoidance)
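
The two thresholds above can be sketched as a small override function. This is a minimal illustration, not the project's AEB module: the function names, the `(steering, throttle, brake)` return layout, and the full-brake value are assumptions, and "reduce throttle to 50%" is interpreted here as capping throttle at 0.5. TTC is computed in the usual way as range divided by closing speed.

```python
import math
from typing import Tuple

EMERGENCY_TTC_S = 1.0  # threshold from the list above
CAUTION_TTC_S = 2.5    # threshold from the list above
CAUTION_THROTTLE_CAP = 0.5


def time_to_collision(range_m: float, closing_speed_mps: float) -> float:
    """TTC = range / closing speed; infinite when the gap is not shrinking."""
    if closing_speed_mps <= 0.0:
        return math.inf
    return max(0.0, range_m) / closing_speed_mps


def apply_aeb(steering: float, throttle: float, ttc_s: float) -> Tuple[float, float, float]:
    """Return (steering, throttle, brake) after the AEB override.

    Steering is always passed through so the planner keeps lateral control.
    """
    if ttc_s <= EMERGENCY_TTC_S:
        return steering, 0.0, 1.0  # full brake is an assumption
    if ttc_s <= CAUTION_TTC_S:
        return steering, min(throttle, CAUTION_THROTTLE_CAP), 0.0
    return steering, throttle, 0.0


if __name__ == "__main__":
    ttc = time_to_collision(range_m=20.0, closing_speed_mps=10.0)  # 2.0 s
    print(apply_aeb(steering=0.1, throttle=0.8, ttc_s=ttc))
```

Checking the emergency threshold first matters: a TTC of 0.8 s also satisfies the 2.5 s condition, and the stronger response has to win.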

Benefits:

  • Proactive: MPC plans around obstacles smoothly
  • Reactive: AEB catches any collision MPC didn't anticipate
  • Robust: Defense-in-depth approach reduces crash risk
  • Future-proof: Once Lidar 3D detection feeds obstacles into the MPC, AEB should rarely need to trigger

Roadmap

Perception

  • Sign classification & Detection (CNN / YOLO)
  • Traffic light classification & Detection (CNN / YOLO)
  • Lane detection Fusion (YOLOP / CV)
  • 🔥🔥 YOLOP integration
    • Drivable area segmentation
    • Lane detection (segmentation output)
    • Object detection
  • CV Lane Detection (Traditional Computer Vision)
  • Integrate Majority Voting system for CV
  • Lighting Condition Detection
  • Real-Time Object Detection (Cars, Trucks, Buses, Pedestrians, Cyclists)
  • 🔥🔥 Speed Estimation using detection from camera and lidar
    • Multiple Object Tracking (MOT)
  • 🔥🔥 Handle dashed lines better in lane detection
  • Road Marking Detection (Arrows, Crosswalks, Stop Lines)
  • 🔥🔥🔥 Lidar Object Detection 3D
  • 💀 Occluded Object Detection (Detect objects that are partially blocked or not visible in the camera view using radar/lidar)
  • Detect multiple lanes
  • 💀 Multi Camera Setup (Will implement after all other camera-based features are finished)
  • 💀 Overtaking, Merging (Will be part of Path Planning)

Sensor Fusion & Calibration

  • Kalman Filtering
    • Extended
  • Integrate Radar
  • Integrate Lidar
  • Integrate GPS
  • Integrate IMU
  • Ultrasonic Sensor Integration
  • 💀💀 SLAM (simultaneous localization and mapping)
    • Build HD Map of the BeamNG.tech map
    • Localize Vehicle on HD Map

Control & Planning

  • Integrate vehicle control (Throttle, Steering, Braking Implemented) (PID needs further tuning)
  • Integrate PIDF controller
  • ⭐ Adaptive Cruise Control (Currently only basic Cruise Control implemented)
  • Automatic Emergency Braking AEB (Safety fallback layer for imminent collisions)
    • Obstacle Avoidance via MPC (Proactive path planning through constraint formulation)
  • πŸ”₯ Model Predictive Control MPC (Integrated with CasADi IPOPT solver, replaces PID control)
  • Curve Speed Optimization (Slow down for sharp curves based on lane curvature)
  • Trajectory Prediction for surrounding vehicles
  • πŸ”₯ Blindspot Monitoring (Using left/right rear short range radars)
  • Traffic Rule Enforcement (Stop at red lights, stop signs, yield signs)
  • Dynamic Target Speed based on Speed Limit Signs
  • Global Path planning
  • Local Path planning
  • πŸ”₯ Lane Change Logic (MPC)
    • Check Blindspots before lane change
    • Signal Lane Change
  • Parking Logic (Path finding / Parallel or Perpendicular)

Visualization & Logging

  • ⭐ Full Foxglove visualization integration (Overhaul needed)
  • Modular YAML configuration system
  • Real-time drive logging and telemetry
  • πŸ”₯ Birds eye view BEV (Top down view of vehicle and surroundings)
  • Real time Annotations Overlay in Foxglove
  • Show predicted trajectories in Foxglove
  • Show Global and local path plans in Foxglove
  • πŸ’€ Live Map Visualization

Note: Considering moving away from Foxglove entirely to build a custom dashboard. Not a priority at this time.

Deployment & Infrastructure

  • Containerize Models for easy deployment and scalability
    • ⭐ Microservices Architecture (Aggregator + individual services)
    • Message Broker (Redis support in docker-compose)
    • Docker Compose orchestration
    • Aggregator service (concurrent service orchestration)

README To-Dos

  • Add detailed documentation (Lane Det first)
  • Add demo images and videos to README
  • 💀💀 Add performance benchmarks section
  • Add Table of Contents for easier navigation

Other

  • Vibe-Code a website for the project
  • Redo project structure for better modularity

A Driver Monitoring System would've been pretty cool, but human drivers are not implemented in BeamNG.tech or CARLA.

Legend

🔥 = High Priority

⭐ = Complete but still being improved/tuned/changed (not final version)

💀 = Minimal Priority, can be addressed later

💀💀 = Very Low Priority, may not be implemented

Note on Installation

Status: This project is currently in active development. A stable, production-ready release with pre-trained models and complete documentation will be available eventually.

Known Limitations

  • Tunnel/Low-Light Scenarios: Camera perception fails below certain lighting thresholds
  • Multi-Camera Support: Single front-facing camera only (future roadmap)
  • PID Controller Tuning: May oscillate on tight curves
  • Real-World Testing: Only validated in simulation (BeamNG.tech), for now...

Credits

Datasets:

  • CULane, LISA, GTSRB, Mapillary, BDD100K

Simulation & Tools:

  • BeamNG.tech by BeamNG GmbH
  • Foxglove Studio for visualization
  • Docker & Docker Compose for containerization

Special Thanks:

  • Kaggle for free GPU resources (model training)
  • Mr. Pratt (teacher/supervisor) for guidance

Acknowledgements

Academic Papers & Research:

YOLOP/YOLOPX: Anchor-free multi-task learning network for panoptic driving perception

@article{YOLOPX2024,
  title={YOLOPX: Anchor-free multi-task learning network for panoptic driving perception},
  author={Zhan, Jiao and Luo, Yarong and Guo, Chi and Wu, Yejun and Liu, Jingnan},
  journal={Pattern Recognition},
  volume={148},
  pages={110152},
  year={2024}
}

MPC Controller: DRL-MPC

@misc{huang_drlmpc,
  author       = {ZITing Huang},
  title        = {DRL-MPC: Integrating Reinforcement Learning and Model Predictive Control for Enhancing Safety in Automated Vehicle Systems},
  year         = {2024},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  url          = {https://github.com/ZITingHUANG1/DRL-MPC}
}

Citation

If you use VisionPilot in your project, please cite:

@software{visionpilot2026,
  title={VisionPilot: Autonomous Driving Simulation, Computer Vision & Real-Time Perception},
  author={Julian Stamm},
  year={2026},
  url={https://github.com/visionpilot-project/VisionPilot}
}

BeamNG.tech Citation

Title: BeamNG.tech
Author: BeamNG GmbH
Address: Bremen, Germany
Year: 2025
Version: 0.35.0.0
URL: https://www.beamng.tech/

License

This project is licensed under the MIT License - see LICENSE file for details.
