We collect Titanic passenger data stored in a CSV file.
Each row represents one passenger, and each column describes one attribute of that passenger.
The file contains:
• Age
• Sex
• Ticket class (1, 2, or 3)
• Price paid
• Where they got on the ship
• Number of family members
• Whether they survived
Some details are missing—like cabin numbers or age for some passengers—but the dataset still gives us enough information to study survival.
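A minimal sketch of loading such a file and counting missing values, using only Python's standard library. The column names (`Age`, `Sex`, `Pclass`, `Fare`, `Embarked`, `FamilySize`, `Survived`) and the four sample rows are assumptions made for illustration; the real file may use different headers.

```python
import csv
import io

# Hypothetical sample standing in for the real CSV file.
# Column names and values are assumed for illustration only.
SAMPLE_CSV = """Age,Sex,Pclass,Fare,Embarked,FamilySize,Survived
22,male,3,7.25,S,1,0
38,female,1,71.28,C,1,1
,female,3,7.92,S,0,1
35,male,1,53.10,S,1,0
"""

def load_passengers(text):
    """Parse CSV text into a list of dicts, one dict per passenger."""
    return list(csv.DictReader(io.StringIO(text)))

def count_missing(rows, column):
    """Count the rows where the given column is empty."""
    return sum(1 for row in rows if row[column].strip() == "")

passengers = load_passengers(SAMPLE_CSV)
missing_age = count_missing(passengers, "Age")
print(len(passengers), "rows;", missing_age, "with missing Age")
```

With a real file, `open("path/to/file.csv", newline="")` could be passed to `csv.DictReader` in place of the in-memory string.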
We carefully examine the data to understand patterns.
We count, compare, and make small graphs to see what stands out.
We notice:
• Women survived at a higher rate than men
• Younger passengers had a better chance of surviving
• First-class passengers survived at a higher rate than third-class passengers
• Some ages are missing and need special care
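The comparisons above come down to counting survivors within each group. The sketch below shows one way to do that in plain Python. The six rows are invented for illustration and are not the real dataset, and the helper name `survival_rate_by` is hypothetical.

```python
from collections import defaultdict

# Invented rows for illustration only (not the real Titanic data).
rows = [
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 3, "Survived": 1},
    {"Sex": "female", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 1, "Survived": 1},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
]

def survival_rate_by(rows, key):
    """Return {group value: fraction of that group that survived}."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        survived[row[key]] += row["Survived"]
    return {group: survived[group] / totals[group] for group in totals}

rate_by_sex = survival_rate_by(rows, "Sex")
rate_by_class = survival_rate_by(rows, "Pclass")
print(rate_by_sex)
print(rate_by_class)
```

Comparing rates rather than raw counts matters because the groups are different sizes.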
We also clean the data by fixing missing values and choosing which columns are useful.
This step helps us understand the story hidden in the numbers.
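One possible way to do this cleaning is sketched below. It assumes missing ages are filled with the median of the known ages, that the mostly empty cabin column is dropped, and that sex is encoded as 0 or 1. These are illustrative choices, not necessarily the ones used in the original code, and the rows are invented.

```python
from statistics import median

# Invented rows for illustration; None marks a missing value.
rows = [
    {"Age": 22.0, "Sex": "male", "Pclass": 3, "Cabin": None, "Survived": 0},
    {"Age": None, "Sex": "female", "Pclass": 1, "Cabin": "C85", "Survived": 1},
    {"Age": 38.0, "Sex": "female", "Pclass": 1, "Cabin": None, "Survived": 1},
    {"Age": 4.0, "Sex": "male", "Pclass": 2, "Cabin": None, "Survived": 1},
]

def clean(rows, keep=("Age", "Sex", "Pclass", "Survived")):
    """Return new rows with only the kept columns and no missing ages."""
    known_ages = [r["Age"] for r in rows if r["Age"] is not None]
    fill_value = median(known_ages)  # assumed strategy: median imputation
    cleaned = []
    for r in rows:
        new_row = {column: r[column] for column in keep}
        if new_row["Age"] is None:
            new_row["Age"] = fill_value
        # Encode sex as a number so a model can use it (1 = female, 0 = male).
        new_row["Sex"] = 1 if r["Sex"] == "female" else 0
        cleaned.append(new_row)
    return cleaned

cleaned = clean(rows)
print(cleaned[1])
```

The function builds new dictionaries instead of editing the originals, so the raw data stays available for later checks.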
We split the data into two groups:
• Training data — used to teach the computer
• Testing data — used to check learning
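The split can be sketched as a shuffle followed by a cut. The 80/20 ratio and the fixed seed below are assumptions chosen for illustration, and `train_test_split` here is a small hand-written helper, not a library function.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and return (training rows, testing rows)."""
    shuffled = list(rows)
    # A fixed seed makes the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Ten placeholder "passengers" identified by number.
rows = list(range(10))
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), "training rows,", len(test_rows), "testing rows")
```

Shuffling before the cut avoids a split that depends on the order of the file.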
We use a model such as logistic regression.
During training, the model looks at each passenger's age, ticket class, and sex to learn which patterns are associated with survival.
For example:
• Women usually survived
• First-class passengers had a higher survival rate
• Babies had a greater chance
The model stores these patterns as numeric weights and can then make predictions.
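To make the training step concrete, here is a small from-scratch sketch of logistic regression trained by gradient descent. It is not the original code; in practice a library implementation such as scikit-learn's `LogisticRegression` would normally be used. The eight training rows, the feature layout `[age / 100, is_female, is_first_class]`, the learning rate, and the number of epochs are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # this form avoids overflow for very negative z
    return e / (1.0 + e)

def predict_proba(weights, bias, x):
    """Estimated survival probability for one feature vector."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            error = predict_proba(weights, bias, xi) - yi
            for j, value in enumerate(xi):
                grad_w[j] += error * value
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Invented examples: [age / 100, is_female, is_first_class]
X = [
    [0.30, 1, 1], [0.25, 1, 0], [0.02, 0, 0], [0.40, 1, 1],
    [0.35, 0, 0], [0.50, 0, 0], [0.45, 0, 1], [0.60, 0, 0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = survived, 0 = did not survive

weights, bias = train_logistic(X, y)
print("weights:", weights, "bias:", bias)
```

A positive weight means the feature pushes the prediction toward survival, and a negative weight pushes it away. On this toy data the weight for `is_female` comes out positive.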
After training, we test the model on the testing data, which it has not seen before.
We check what fraction of its predictions are correct; this is the accuracy.
If accuracy is low, we adjust the model, clean more data, or try new methods.
If accuracy is good, the model is ready to predict whether a new passenger would survive.
This final step checks that the model is dependable before we rely on it.
In my code, the highest accuracy was 78% (0.78).
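Accuracy is the number of correct predictions divided by the total number of predictions. The sketch below shows the calculation on nine invented labels. They are not the results from the original code and were chosen only so that the arithmetic is easy to follow.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Invented labels: 1 = survived, 0 = did not survive.
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0, 1]

score = accuracy(y_true, y_pred)
print(f"{score:.2f} = {score:.0%}")  # 7 of 9 correct, prints "0.78 = 78%"
```

The same number can be written as a fraction (0.78) or as a percentage (78%).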