Car Crash Data Project
Why
I’ve noticed that local news outlets frequently report statistics about car crashes in my city, often claiming that "90% of accidents involve taxis." That may be possible, since taxis are heavily used here, but I’m skeptical of how accurate the number is. Claims like this shape public perception, and I want to see whether the data really supports them. That’s why I’m building this project: to gather, analyze, and verify real data and find out whether taxis are truly involved in such a high percentage of accidents.
How
I could simply use an Excel file to store all the data, but that wouldn’t teach me anything new. Instead, I want to challenge myself and learn more about databases and data analysis. Here are the steps I plan to follow:
Phase 1: Database Setup and Basic Analytics
☐ Create a database in SQL Server
☐ Write a Python script to load data
☐ Create a draft report in Power BI
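A minimal sketch of what the Phase 1 loading script could look like. To keep it runnable without a server, it uses Python's built-in sqlite3 module as a stand-in for SQL Server; a real version would connect through a SQL Server driver such as pyodbc instead. The table name, the column names (crash_date, location, involves_taxi), and the sample rows are all assumptions made up for illustration, not real data.

```python
import csv
import io
import sqlite3

# Hypothetical schema: one row per crash, with a 0/1 flag for taxi involvement.
SCHEMA = """
CREATE TABLE IF NOT EXISTS crashes (
    crash_id      INTEGER PRIMARY KEY,
    crash_date    TEXT NOT NULL,
    location      TEXT,
    involves_taxi INTEGER NOT NULL CHECK (involves_taxi IN (0, 1))
)
"""

# Invented sample rows, only here so the sketch can run end to end.
SAMPLE_CSV = """crash_date,location,involves_taxi
2024-01-05,Main St,1
2024-01-07,River Rd,0
2024-01-09,Main St,1
2024-01-12,Market Sq,1
"""


def load_crashes(conn, csv_file):
    """Create the table if needed and insert every CSV row; return the row count."""
    conn.execute(SCHEMA)
    reader = csv.DictReader(csv_file)
    rows = [
        (r["crash_date"], r["location"], int(r["involves_taxi"]))
        for r in reader
    ]
    conn.executemany(
        "INSERT INTO crashes (crash_date, location, involves_taxi) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


def taxi_share(conn):
    """Return the percentage of crashes involving a taxi, or None if no rows."""
    total, taxi = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(involves_taxi), 0) FROM crashes"
    ).fetchone()
    if total == 0:
        return None
    return 100.0 * taxi / total


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing is saved
    loaded = load_crashes(conn, io.StringIO(SAMPLE_CSV))
    print(f"Loaded {loaded} rows; taxi share = {taxi_share(conn):.1f}%")
```

The single taxi_share query is the number the whole project is meant to check against the "90%" claim, so it is worth keeping separate from the loading code.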
Phase 2: Advanced Data Population
☐ Incorporate AI for data auto-fill
☐ Migrate the database to Azure
☐ Update the Power BI report
Phase 3: Real-World Applications and Collaboration
☐ Try to collaborate with local authorities
☐ Publish an anonymized version on Kaggle
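As a rough idea of what preparing the published version might involve, here is a small sketch that drops directly identifying fields and replaces others with a keyed hash. The field names (reporter_name, post_url, plate_number) are hypothetical, since the final schema is not decided yet. Keyed hashing is pseudonymization rather than full anonymization, so this would be only one part of the work.

```python
import hashlib
import hmac

# Hypothetical field names; the real ones depend on the final schema.
DROP_FIELDS = {"reporter_name", "post_url"}   # removed entirely
HASH_FIELDS = {"plate_number"}                # replaced with a keyed hash


def anonymize(record, secret_key):
    """Return a copy of record without DROP_FIELDS and with HASH_FIELDS hashed."""
    cleaned = {}
    for field, value in record.items():
        if field in DROP_FIELDS:
            continue
        if field in HASH_FIELDS and value is not None:
            digest = hmac.new(
                secret_key, str(value).encode("utf-8"), hashlib.sha256
            ).hexdigest()
            # A shortened digest keeps the column readable; the same input
            # still maps to the same token, so repeat vehicles can be counted.
            cleaned[field] = digest[:16]
        else:
            cleaned[field] = value
    return cleaned


if __name__ == "__main__":
    # Invented example record, not real data.
    example = {
        "crash_date": "2024-01-05",
        "involves_taxi": 1,
        "plate_number": "ABC-123",
        "reporter_name": "Example Person",
        "post_url": "https://example.com/post/1",
    }
    print(anonymize(example, secret_key=b"keep-this-key-private"))
```

The secret key would have to stay out of the published dataset, otherwise the hashed values could be recomputed from guessed plate numbers.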
Key Consideration
Data Source
For this project, my primary data source will be Facebook.