Jack C
Jul 19, 2021

I think there's an issue with your model here: your confusion matrix shows that the model never identifies an email as 'spam' (it never predicts any 1s).

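In essence, your model is simply predicting 'not spam' for every email. So whilst your accuracy is 74%, that figure just reflects the share of non-spam emails in the data you evaluated on; the model hasn't learned anything that separates spam from non-spam.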
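To make that concrete, here is a minimal sketch using made-up labels (74 non-spam, 26 spam, chosen only to match the 74% figure, not taken from your data). The helper `confusion_counts` is a hypothetical name of mine, not something from the article. A "model" that always predicts 0 scores 74% accuracy while catching no spam at all:

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (actual, predicted) pairs for binary labels: 0 = not spam, 1 = spam."""
    pairs = Counter(zip(y_true, y_pred))
    return {
        "tn": pairs[(0, 0)],  # non-spam predicted as non-spam
        "fp": pairs[(0, 1)],  # non-spam predicted as spam
        "fn": pairs[(1, 0)],  # spam predicted as non-spam
        "tp": pairs[(1, 1)],  # spam predicted as spam
    }


# Made-up labels for illustration only: 74 non-spam and 26 spam emails.
y_true = [0] * 74 + [1] * 26
# A "model" that predicts not-spam for every email.
y_pred = [0] * len(y_true)

counts = confusion_counts(y_true, y_pred)
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
spam_recall = counts["tp"] / (counts["tp"] + counts["fn"])

print(counts)       # {'tn': 74, 'fp': 0, 'fn': 26, 'tp': 0}
print(accuracy)     # 0.74
print(spam_recall)  # 0.0
```

Looking at recall on the spam class (or the full confusion matrix) alongside accuracy is what exposes this kind of always-predict-the-majority behaviour.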

My best guess is that in your first bit of code, where you save 'cleaned_data.csv', you've somehow saved a version of the data filtered to only non-spam emails and then used that for training.
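One quick way to test that guess is to count the label values in the saved file. This is only a sketch: the column name `label` and the helper `label_counts` are assumptions of mine, so swap in whatever your CSV actually uses. The demo below runs on a tiny made-up stand-in for a file that has lost its spam rows:

```python
import csv
import io
from collections import Counter


def label_counts(handle, label_column):
    """Count how many rows carry each value in label_column of an open CSV file."""
    reader = csv.DictReader(handle)
    return Counter(row[label_column] for row in reader)


# Made-up stand-in for a CSV that contains only non-spam rows.
sample = io.StringIO("text,label\nhello,0\nmeeting at 3,0\nlunch?,0\n")
counts = label_counts(sample, "label")
only_one_class = len(counts) < 2

print(counts)          # Counter({'0': 3})
print(only_one_class)  # True

# Against the real file (column name is an assumption):
# with open("cleaned_data.csv", newline="", encoding="utf-8") as handle:
#     print(label_counts(handle, "label"))
```

If only one label value shows up in the real file, the filtering happened before the save; if both show up, the problem is more likely further along in the training code.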

However, nice article overall :)
