
How to handle Null values using Python…

Iqra Naeem
5 min read · Dec 14, 2020


Missing data occurs when no information is provided for one or more items, or for an entire unit. In real-world datasets, missing data is a common problem. In pandas, missing values are referred to as NA (Not Available) values. Many datasets arrive with missing data, either because the information exists but was not collected or because it never existed. For example, some users taking a survey may choose not to share their address, so the address field is missing for those records.
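To make the survey example concrete, here is a minimal sketch of how pandas marks missing answers. The `survey` DataFrame and its values are made up for illustration and are not part of the Titanic data.

```python
import numpy as np
import pandas as pd

# Hypothetical survey responses; None and np.nan both mark a missing answer.
survey = pd.DataFrame({
    "user": ["A", "B", "C", "D"],
    "address": ["12 Main St", None, "7 Park Ave", None],
    "age": [34, 29, np.nan, 41],
})

# isnull() returns a boolean mask: True where a value is missing.
missing_mask = survey.isnull()

# Summing the mask counts the missing values in each column.
missing_counts = missing_mask.sum()
print(missing_counts)
```

Here two users skipped the address question and one skipped age, so those columns report missing values while `user` reports none.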

Dataset:

We are going to use the Titanic - Machine Learning from Disaster dataset from Kaggle.

import pandas as pd
# Load the training data into df, the name used in the rest of the code
df = pd.read_csv("../input/train.csv")

Identify the Columns with Missing Values:

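# Percentage of missing values in each column
Missing_Percentage = df.isnull().sum() * 100 / len(df)
# Summary table: column name, missing count, and missing percentage
Missing_Value = pd.DataFrame({'column_Name': df.columns,
                              'Missing_Values': df.isnull().sum(),
                              'Missing_Percentage': Missing_Percentage})
# Show the columns with the most missing data first
Missing_Value = Missing_Value.sort_values('Missing_Percentage', ascending=False)
Missing_Value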
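If you want to try the same summary without downloading the Kaggle file, the sketch below wraps it in a small helper and runs it on a toy DataFrame. The helper name `missing_summary` and the toy rows are assumptions made for illustration; they are not real Titanic records.

```python
import numpy as np
import pandas as pd

def missing_summary(frame):
    """Return per-column missing counts and percentages, largest first."""
    counts = frame.isnull().sum()
    summary = pd.DataFrame({
        "Missing_Values": counts,
        "Missing_Percentage": counts * 100 / len(frame),
    })
    return summary.sort_values("Missing_Percentage", ascending=False)

# Made-up rows that only mimic the shape of the Titanic columns.
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0],
    "Cabin": [None, "C85", None, None],
    "Fare": [7.25, 71.28, 7.92, 53.10],
})

summary = missing_summary(toy)
print(summary)
```

The column names become the index of the result, so a separate name column is not needed in this version.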
