Pre-recorded question

Question #144
How do I remove duplicate rows from a DataFrame with Python/pandas?
Answer

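The most direct way to remove duplicate rows from a pandas DataFrame is the drop_duplicates() method. By default, rows are compared across all columns, and the surviving rows keep their original index labels: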

import pandas as pd

df = pd.DataFrame([[0, 1], [2, 3], [2, 3], [2, 4]], columns=['Col 1', 'Col 2'])

# Remove duplicate rows, keeping the first occurrence of each (modifies df in place)
df.drop_duplicates(keep='first', inplace=True)


Before:

   Col 1  Col 2
0      0      1
1      2      3
2      2      3
3      2      4

After:

   Col 1  Col 2
0      0      1
1      2      3
3      2      4
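
The answer above uses keep='first' with inplace=True. As a minimal sketch of a few other common variations, assuming the same example data and pandas 1.0 or later (needed for ignore_index), the following shows the non-inplace form, the subset and keep=False options, and duplicated() for inspecting rows before dropping them. The variable names (deduped, renumbered, by_col1, no_repeats, mask) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[0, 1], [2, 3], [2, 3], [2, 4]], columns=['Col 1', 'Col 2'])

# Without inplace=True, a new DataFrame is returned and df is left untouched
deduped = df.drop_duplicates()

# ignore_index=True renumbers the result 0..n-1 (assumes pandas >= 1.0)
renumbered = df.drop_duplicates(ignore_index=True)

# subset restricts the comparison to the listed columns
by_col1 = df.drop_duplicates(subset=['Col 1'])

# keep=False drops every row that has a duplicate, instead of keeping one copy
no_repeats = df.drop_duplicates(keep=False)

# duplicated() returns a boolean mask marking the rows that would be dropped
mask = df.duplicated()

print(deduped)
print(renumbered)
print(by_col1)
print(no_repeats)
print(mask)
```

Returning a new DataFrame instead of using inplace=True leaves the original data available for comparison, which can help when checking what was removed.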


2 events in history
Answer by Alphonsio 08/27/2020 at 01:27:04 PM

Question by Alphonsio 08/27/2020 at 01:22:17 PM