Suppose I have this minimal reproducible example:
import pandas as pd
df = pd.DataFrame({"A": [12, 4, 5, None, 1],
                   "B": [7, 2, 54, 3, None],
                   "C": [20, 16, 11, 3, 8],
                   "D": [14, 3, None, 2, 6]})
index_ = ['Row_1', 'Row_2', 'Row_3', 'Row_4', 'Row_5']
df.index = index_
print(df)
# Option 1
result = df[['A', 'D']]
print(result)
# Option 2
result = df.loc[:, ['A', 'D']]
print(result)
What is the effect of using loc versus plain bracket indexing? The two results look identical.
I ask this in preparation for a more complex question in which I have been instructed to use loc.
>Solution :
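Both forms return the same data. The difference is that pandas marks the result of df[['A', 'D']] as a possible copy of df (through the private _is_copy attribute), so assigning data to that slice later can trigger a SettingWithCopyWarning. The result of df.loc[:, ['A', 'D']] carries no such marker.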
result1 = df[['A', 'D']]
print(result1._is_copy)
#<weakref at 0x7f34261b69d0; to 'DataFrame' at 0x7f34260e9590>
result2 = df.loc[:, ['A', 'D']]
print(result2._is_copy)
# None
df.loc[:, ['A', 'D']] is the safer choice if you want an independent slice that you can modify. Appending .copy() makes that intent explicit.
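As a minimal sketch of the independent-slice idea, the snippet below takes the loc slice, adds an explicit .copy() (an addition for clarity, not something the question requires), and changes one value. The names independent, original_value and modified_value are only illustrative. The point it shows is that the original df is left untouched.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [12, 4, 5, None, 1], "D": [14, 3, None, 2, 6]},
    index=["Row_1", "Row_2", "Row_3", "Row_4", "Row_5"],
)

# Take the slice with loc; .copy() is an explicit, assumed extra step
# that states the result should not share data with df.
independent = df.loc[:, ["A", "D"]].copy()

# Modify the slice only.
independent.loc["Row_1", "A"] = 0

# Compare the same cell in the original frame and in the slice.
original_value = df.loc["Row_1", "A"]
modified_value = independent.loc["Row_1", "A"]
print(original_value, modified_value)
```

If the goal is instead to change df itself, assigning directly through df.loc[rows, cols] = value on the original frame avoids the intermediate slice altogether.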