Solving the Cosmic Mystery of Spaceship Titanic with Data Science


In the year 2912, our technological advancements have taken us far beyond the confines of Earth, leading us to explore and colonize distant exoplanets. However, with great exploration comes new challenges, and today, we face a cosmic mystery that demands our best data science skills.

The Incident

A month ago, the Spaceship Titanic, an interstellar passenger liner, embarked on its maiden voyage. Its mission was to transport nearly 13,000 emigrants from our solar system to three newly habitable exoplanets orbiting nearby stars. The journey took an unexpected turn when, while rounding Alpha Centauri, the spaceship collided with a spacetime anomaly hidden within a dust cloud. This collision, eerily reminiscent of its Earth-bound namesake’s fate, resulted in nearly half of its passengers being transported to an alternate dimension.

The Challenge

The Spaceship Titanic’s tragedy has led to an unprecedented challenge: predicting which passengers were transported by the anomaly. To assist in this mission, we’ve retrieved personal records from the ship’s damaged computer system, providing us with crucial data to analyze.

The Approach

To tackle this challenge, I employed a variety of data science techniques. The process began with the essential task of data preprocessing, where I handled missing values and encoded categorical variables. I used Python with the pandas and scikit-learn (sklearn) libraries to clean and prepare the data for analysis.
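To make the preprocessing step concrete, here is a minimal sketch of one way to impute missing values and one-hot encode a categorical column with scikit-learn. The column names (HomePlanet, Age, RoomService), the sample values, and the choice of median and most-frequent imputation are assumptions for illustration; the post does not say which strategies were actually used.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Tiny made-up sample; column names and values are assumptions.
df = pd.DataFrame({
    "HomePlanet": ["Earth", "Europa", np.nan, "Earth"],
    "Age": [39.0, np.nan, 24.0, 58.0],
    "RoomService": [0.0, 109.0, np.nan, 43.0],
})

numeric_cols = ["Age", "RoomService"]
categorical_cols = ["HomePlanet"]

# Numeric columns: fill gaps with the column median.
# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
preprocessor = ColumnTransformer(
    [
        ("num", SimpleImputer(strategy="median"), numeric_cols),
        ("cat", Pipeline([
            ("impute", SimpleImputer(strategy="most_frequent")),
            ("onehot", OneHotEncoder(handle_unknown="ignore")),
        ]), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array, easier to inspect
)

X = preprocessor.fit_transform(df)
print(X.shape)  # 2 numeric columns + 2 one-hot columns
```

Keeping the imputers and the encoder inside one ColumnTransformer means the statistics learned from the training data (medians, most frequent categories) can be reused unchanged on any later data.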

Next, I chose the RandomForestClassifier model because it performs well on tabular data that mixes numerical features with encoded categorical ones. I wrapped the preprocessing steps and the classifier in a single pipeline and trained it on the available passenger data, letting the model pick up patterns and correlations that predict the likelihood of a passenger being transported to the alternate dimension.
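A pipeline of this kind might look like the sketch below. It is not the exact pipeline from the project: the data is synthetic, the column names and the rule that generates the Transported label are invented for the example, and the hyperparameters are defaults rather than tuned values. A held-out validation split is added to show how the fit can be checked.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for the passenger records (all names and values are assumptions).
rng = np.random.default_rng(0)
n = 200
train = pd.DataFrame({
    "HomePlanet": rng.choice(["Earth", "Europa", "Mars"], size=n),
    "Age": rng.integers(1, 80, size=n).astype(float),
    "Spa": rng.exponential(200.0, size=n),
})
# Invented labelling rule so the model has a signal to learn.
train["Transported"] = train["Spa"] < 100
# Knock out some ages to mimic missing values in the records.
train.loc[rng.choice(n, size=20, replace=False), "Age"] = np.nan

preprocessor = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), ["Age", "Spa"]),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), ["HomePlanet"]),
])

# Preprocessing and classifier live in one object, so they are fitted together.
model = Pipeline([
    ("preprocess", preprocessor),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
])

X = train.drop(columns="Transported")
y = train["Transported"]
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model.fit(X_train, y_train)
valid_accuracy = model.score(X_valid, y_valid)
print(f"validation accuracy: {valid_accuracy:.3f}")
```

Because the imputers and encoder are fitted only on the training split, the validation score is not inflated by statistics leaking in from the held-out rows.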

Predicting the Unseen

The final step was to apply this model to unseen test data, predicting the fate of each passenger. By feeding the test data through the same preprocessing steps and then using my trained model, I generated predictions that could potentially aid rescue operations.
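The prediction step can be sketched as follows. The key point is that the fitted pipeline is reused as-is on the test data, with no re-fitting, so the test rows go through exactly the same imputation and encoding. The data, column names, and labelling rule are again invented for illustration, and the helper make_passengers is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(42)

def make_passengers(n, planets):
    """Hypothetical helper: build a synthetic passenger table."""
    return pd.DataFrame({
        "HomePlanet": rng.choice(planets, size=n),
        "Age": rng.integers(1, 80, size=n).astype(float),
        "Spa": rng.exponential(200.0, size=n),
    })

train = make_passengers(150, ["Earth", "Europa"])
y_train = train["Spa"] < 100  # invented labelling rule

# The test set has a planet never seen in training and a missing age.
test = make_passengers(40, ["Earth", "Europa", "Mars"])
test.loc[0, "Age"] = np.nan

model = Pipeline([
    ("preprocess", ColumnTransformer([
        ("num", SimpleImputer(strategy="median"), ["Age", "Spa"]),
        ("cat", Pipeline([
            ("impute", SimpleImputer(strategy="most_frequent")),
            # Unseen categories are encoded as all zeros instead of raising.
            ("onehot", OneHotEncoder(handle_unknown="ignore")),
        ]), ["HomePlanet"]),
    ])),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
])

model.fit(train, y_train)

# predict() runs the already-fitted preprocessing, then the forest.
predictions = model.predict(test)
print(predictions[:5])
```

Setting handle_unknown="ignore" on the encoder is one way to keep prediction from failing when the test data contains a category that never appeared during training.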

Creating a Path for Rescue

The outcomes were compiled into a submission file, ready to be shared with the rescue crews. This file, a beacon of hope, could significantly increase the efficiency and success rate of the rescue missions, potentially saving thousands of lives.
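Writing the submission file takes only a few lines of pandas. The sketch below assumes a two-column layout with a passenger identifier and the predicted label; the column names PassengerId and Transported, the ID values, and the file name submission.csv are assumptions, since the post does not spell out the format.

```python
import numpy as np
import pandas as pd

# Made-up identifiers and predictions standing in for the model output.
passenger_ids = ["0013_01", "0018_01", "0019_01"]
predictions = np.array([True, False, True])

submission = pd.DataFrame({
    "PassengerId": passenger_ids,
    "Transported": predictions,
})

# index=False keeps the row index out of the file.
submission.to_csv("submission.csv", index=False)

# Read the file back as a quick sanity check on shape and columns.
check = pd.read_csv("submission.csv")
print(check)
```

Reading the file back before sharing it is a cheap way to confirm that the row count matches the test set and that no stray index column slipped in.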


This project is more than a showcase of data science skills; it’s a testament to how technology can be harnessed to solve not just earthly challenges but cosmic mysteries too. As I continue to reach for the stars, the interplay between technology and human ingenuity will remain my greatest asset in navigating the unknown frontiers of space.
