Project Overview
Objective
The project aimed to evaluate and improve assistance services for disabled passengers across the French SNCF (Société nationale des chemins de fer français) railway network. With over 3,000 stations to manage, the SNCF faced a considerable challenge in meeting the varying needs of passengers with reduced mobility efficiently and effectively.

Context
This study was conducted against a backdrop of growing awareness of, and regulatory requirements for, accessibility in public transport. Between 2015 and 2022, the number of disabled passengers travelling with the SNCF increased, prompting an urgent need for a thorough analysis and improvement of the services offered.
Duration
The project spanned several months, encompassing data collection, analysis, and strategy formulation.
Role
As the lead analyst, I was responsible for data analysis, identifying key trends, and proposing actionable recommendations. The project involved collaborating with various stakeholders, including station managers, accessibility advocates, and passenger groups.
Tools and Methodologies
Data analysis was conducted using Python for statistical computations and Tableau for visualizations. The methodologies included correlation analysis, linear regression, and cluster analysis to identify patterns and draw insights.
The Approach and Process
Data Collection
The first step was gathering quantitative data on station accessibility features, passenger numbers, and the volume of assistance provided. This data was sourced from SNCF's internal records.
Exploratory Analysis
The data was first explored with a correlation matrix to understand the relationships between the different types of assistance and passenger volumes. As expected, this revealed strong correlations among the assistance services themselves.
    import matplotlib.pyplot as plt
    import seaborn as sns

    # numeric_df is assumed to be a pandas DataFrame holding only the numeric columns

    # Create the correlation matrix
    corr_matrix = numeric_df.corr()

    # Create a subplot with matplotlib
    f, ax = plt.subplots(figsize=(10, 8))

    # Create the heatmap (cells are annotated manually below)
    sns.heatmap(corr_matrix, annot=False, ax=ax)

    # Define a threshold to distinguish dark and light cells
    threshold = corr_matrix.max().max() / 2

    # Manually annotate each cell with the correlation coefficient
    for i in range(len(corr_matrix.columns)):
        for j in range(len(corr_matrix.columns)):
            value = corr_matrix.iloc[i, j]
            # With seaborn's default colormap, low values are drawn dark, so they get white text
            text_color = "white" if value < threshold else "black"
            ax.text(j + 0.5, i + 0.5, f"{value:.2f}",
                    horizontalalignment='center', verticalalignment='center',
                    color=text_color)

    plt.title('Correlation Matrix', fontsize=14)
    plt.show()

Linear Regression
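A linear regression model tested the hypothesis that higher passenger numbers are associated with higher demand for assistance services. The model showed that total passenger numbers accounted for only 24% of the variance in the assistance service data (R² = 0.24). For a single-predictor model this corresponds to a correlation coefficient of about 0.49 in magnitude: a moderate yet significant correlation that leaves roughly three quarters of the variance unexplained by passenger volume alone.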
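
As a rough illustration of how the share of explained variance is obtained, here is a minimal sketch of a single-predictor least-squares fit. The function name `simple_linear_regression` and the numbers are made up for illustration; they are not the project's code or SNCF data.

```python
from statistics import fmean


def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = fmean(x), fmean(y)
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With one predictor, R^2 equals the squared Pearson correlation
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical values, for illustration only (not SNCF data)
passengers = [1.0, 2.0, 3.0, 4.0, 5.0]
assistances = [2.0, 1.0, 4.0, 3.0, 5.0]

slope, intercept, r2 = simple_linear_regression(passengers, assistances)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.2f}")
```

An R² of 0.24, as reported for the real data, would mean the fitted line captures about a quarter of the station-to-station variation in assistance volumes.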

Cluster Analysis
To capture the complexities of the data, cluster analysis was employed, revealing four distinct groups with varying passenger volumes and assistance provided. This segmentation allowed for a nuanced understanding of the interplay between passenger numbers and assistance needs.
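
The text does not name the clustering algorithm used, so the following is only a sketch under the assumption of k-means on two features per station (for example, log-scaled passenger volume and log-scaled assistance count). The function `kmeans`, the deterministic initialisation, and the sample points are all hypothetical.

```python
import math


def kmeans(points, k, max_iter=100):
    """Cluster points with Lloyd's algorithm; returns (centroids, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Deterministic farthest-point initialisation: start from the first point,
    # then repeatedly add the point farthest from the centroids chosen so far.
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(tuple(farthest))

    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Hypothetical (log10 passengers, log10 assistances) pairs, not SNCF data
stations = [
    (3.0, 1.0), (3.1, 1.1),
    (4.0, 2.0), (4.1, 2.1),
    (5.0, 3.0), (5.1, 3.2),
    (6.0, 4.5), (6.1, 4.4),
]
centroids, labels = kmeans(stations, k=4)
for point, label in zip(stations, labels):
    print(point, "-> cluster", label)
```

Log-scaling is suggested here because raw passenger counts span several orders of magnitude, which would otherwise let the busiest stations dominate the distance calculation.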

Challenges and Solutions
One significant challenge was the outlier data presented by extremely busy stations, such as Paris Nord Grandes Lignes, which did not follow the general trend. Addressing this involved tailored recommendations for high-traffic stations.
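How the outlier stations were identified is not stated here. One common option, shown purely as an assumption, is the interquartile-range (IQR) rule applied to passenger volumes. The station names and figures below are invented for illustration.

```python
from statistics import quantiles


def flag_iqr_outliers(volumes, factor=1.5):
    """Return the keys whose value lies outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = quantiles(volumes.values(), n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    return [name for name, value in volumes.items() if value < lower or value > upper]


# Hypothetical annual passenger volumes (thousands), not SNCF data
volumes = {
    "Station A": 120, "Station B": 150, "Station C": 180, "Station D": 200,
    "Station E": 220, "Station F": 250, "Station G": 300, "Station H": 9000,
}
outliers = flag_iqr_outliers(volumes)
print(outliers)
```

Stations flagged this way can then be analysed separately, which matches the idea of tailored recommendations for the busiest stations.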
Visualizations and Insights
Maps and scatterplots visualized geographic distributions and the relationship between passenger volumes and assistance provided. These visuals highlighted the need for differentiated strategies across regions and station clusters.


End Results and Recommendations
Strategic Recommendations
- Prioritize High-Traffic Stations: Allocate resources to meet the high demand for assistance services.
- Improve Service Level in Intermediate Stations: Elevate service levels to ensure efficient assistance delivery.
- Expand Wheelchair Availability: Keep pace with passenger volume to address mobility assistance needs.
- Targeted Accessibility Upgrades: Assess the cost-benefit ratio of further upgrades in stations with high independent access levels.
Outcomes
The case study concluded with strategic recommendations to enhance the accessibility and service quality for disabled passengers. The insights gained were expected to guide the SNCF in prioritizing investments and improving the overall passenger experience.
Reflections
The project underscored the importance of data-driven decision-making in public service provision. It also highlighted the need for continuous improvement and adaptation to meet the diverse needs of all passengers.
Next Steps
- Incorporate Qualitative Data: Integrate passenger feedback for a holistic service assessment.
- Monitor Implementation: Track changes and measure success using key performance indicators.
- Iterative Improvement: Refine services based on feedback and emerging needs.
Conclusion
This case study illustrated the analytical journey undertaken to understand and improve assistance for disabled people at French SNCF stations. It served as a testament to the power of data analytics in shaping public transportation policies and the commitment to creating an inclusive environment for all passengers.
For a deeper dive into the methodologies and data, see the project's GitHub repository.