Video tutorial: Outlier detection and removal: z score, standard deviation | Feature engineering tutorial python # 3, part of the Machine Learning course by the codebasics channel, video no. 41, free, accredited, online
If we have a dataset that follows a normal distribution, then we can treat values that lie 3 or more standard deviations from the mean as outliers. Many times these are legitimate values, and whether you remove them or not really depends on the situation. But removing outliers can significantly increase the statistical power of a machine learning model, hence it is recommended that you treat outliers before building a model. A z score indicates how many standard deviations a given sample is away from the mean. We are going to go through all this theory and write Python code to remove outliers from a heights dataset taken from Kaggle.
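As a rough illustration of the two approaches described above, here is a minimal sketch using only the Python standard library. The data is synthetic (generated with `random.gauss`, with two extreme values planted by hand), not the Kaggle heights dataset, and names such as `heights` and `z_score` are chosen for this example only; the linked notebook may do this differently.

```python
import random
import statistics

# Synthetic stand-in for a height column (inches). This is NOT the Kaggle data.
rng = random.Random(0)
heights = [rng.gauss(66.0, 3.8) for _ in range(1000)]
heights += [40.0, 95.0]  # two planted extreme values

mean = statistics.fmean(heights)
std = statistics.pstdev(heights)  # population standard deviation


def z_score(x, mean, std):
    """Number of standard deviations that x lies away from the mean."""
    return (x - mean) / std


# Approach 1: keep values inside mean +/- 3 standard deviations.
lower, upper = mean - 3 * std, mean + 3 * std
kept_by_limits = [h for h in heights if lower <= h <= upper]

# Approach 2: keep values whose absolute z score is at most 3.
kept_by_z = [h for h in heights if abs(z_score(h, mean, std)) <= 3]
outliers = [h for h in heights if abs(z_score(h, mean, std)) > 3]

print(f"mean={mean:.2f}, std={std:.2f}, limits=({lower:.2f}, {upper:.2f})")
print(f"kept by limits: {len(kept_by_limits)}, kept by z score: {len(kept_by_z)}")
print(f"flagged as outliers: {sorted(outliers)}")
```

Both filters express the same rule: a z score of 3 is exactly the point 3 standard deviations from the mean, so the z score version is mainly a rescaling that makes the threshold independent of the column's units.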
Link for kaggle dataset: https://www.kaggle.com/mustafaali96/weight-height
Code & Exercise: https://github.com/codebasics/py/blob/master/ML/FeatureEngineering/2_outliers_z_score/2_outliers_z_score.ipynb
CSV file for exercise: https://github.com/codebasics/py/tree/master/ML/FeatureEngineering/2_outliers_z_score/Exercise
Topics
00:00 Introduction
00:20 Exploratory analysis on a kaggle dataset
01:14 Plot histogram and bell curve
06:30 Use 3 standard deviations to remove outliers
12:14 Use Z score to remove outliers
17:39 Exercise
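For the histogram and bell curve topic, the sketch below compares a histogram's density with a fitted normal curve as text output, so it needs no plotting library. It again uses synthetic data rather than the Kaggle file, and the bin layout (`start`, `bin_width`, `n_bins`) is an arbitrary choice for this example. It also prints the share of a normal distribution that falls within 3 standard deviations, which is why that cutoff flags so few points.

```python
import random
from statistics import NormalDist, fmean, pstdev

# Synthetic heights (inches). An assumption for illustration, not the Kaggle data.
rng = random.Random(1)
heights = [rng.gauss(66.0, 3.8) for _ in range(5000)]

# Normal distribution fitted to the sample mean and standard deviation.
fitted = NormalDist(mu=fmean(heights), sigma=pstdev(heights))

# Fixed-width histogram bins covering 54 to 78 inches (arbitrary layout).
start, bin_width, n_bins = 54.0, 2.0, 12
counts = [0] * n_bins
for h in heights:
    idx = int((h - start) // bin_width)
    if 0 <= idx < n_bins:
        counts[idx] += 1

# For each bin: center, histogram density, and bell-curve density at the center.
rows = []
for i, count in enumerate(counts):
    center = start + (i + 0.5) * bin_width
    density = count / (len(heights) * bin_width)
    rows.append((center, density, fitted.pdf(center)))

for center, density, pdf in rows:
    bar = "#" * round(density * 400)
    print(f"{center:5.1f}  hist={density:.4f}  bell={pdf:.4f}  {bar}")

# Share of a normal distribution lying within 3 standard deviations of the mean.
within_3 = (fitted.cdf(fitted.mean + 3 * fitted.stdev)
            - fitted.cdf(fitted.mean - 3 * fitted.stdev))
print(f"share within 3 standard deviations: {within_3:.4f}")
```

If the `hist` and `bell` columns track each other closely, the normality assumption behind the 3 standard deviation rule is at least plausible for that column.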
Do you want to learn technology from me? Check https://codebasics.io/ for my affordable video courses.
Website: https://codebasics.io/
Facebook: https://www.facebook.com/codebasicshub
Twitter: https://twitter.com/codebasicshub