Difference Between Shallow Copy and Deep Copy in Pandas DataFrames
  • Last Updated : 02 Dec, 2020

The pandas library has two main data structures: DataFrame and Series. Internally, each is represented by index arrays, which label the data, and data arrays, which hold the actual values. Copying a DataFrame or Series therefore means copying its indices and data, and there are two ways to do so: shallow copy and deep copy.

Both are performed with the copy() method: pandas.DataFrame.copy(deep=False) creates a shallow copy and pandas.DataFrame.copy(deep=True) creates a deep copy. Series provides the same method as pandas.Series.copy().
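
As a minimal sketch of the two calls (the variable names `alias`, `shallow`, and `deep` are illustrative only), the snippet below contrasts plain assignment, which merely creates a second name for the same object, with the two kinds of copy, each of which returns a new DataFrame. Note that `deep=True` is the default value of the `deep` parameter.

```python
import pandas as pd

df = pd.DataFrame({'index': [1, 2, 3, 4],
                   'GFG': ['Mandy', 'Ron', 'Jacob', 'Bayek']})

alias = df                      # no copy: another name for the same object
shallow = df.copy(deep=False)   # new DataFrame object, shallow copy
deep = df.copy()                # deep=True is the default

print(alias is df)      # True
print(shallow is df)    # False
print(deep is df)       # False

# Both copies start out with the same contents as the original
print(shallow.equals(df), deep.equals(df))  # True True
```

Whether a later change to `shallow` or `deep` reaches `df` is what distinguishes the two kinds of copy.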

Now, let’s understand what shallow copying is.

Shallow Copy

When a shallow copy of a DataFrame or Series is created, the indices and data of the original are not copied; the new object only holds references to them. As a result, the two objects share the same underlying data, and in pandas versions without Copy-on-Write an in-place change made to one is reflected in the other.

In general terms, shallow copying means constructing a new collection object and populating it with references to the child objects found in the original. The copying process does not recurse, so the child objects themselves are not copied.
Example:



# program to demonstrate a shallow copy
# of a pandas DataFrame

# import module
import pandas as pd

# create the DataFrame
df = pd.DataFrame({'index': [1, 2, 3, 4],
                   'GFG': ['Mandy', 'Ron', 'Jacob', 'Bayek']})

# shallow copy
copydf = df.copy(deep=False)

# compare the shallow-copied DataFrame
# with the original DataFrame
print('\nBefore Operation:\n', copydf == df)

# in-place assignment through the shallow copy
# (without Copy-on-Write this writes into the data shared with df;
# with Copy-on-Write enabled, df is left unchanged)
copydf.loc[0, 'index'] = 0

# compare the shallow-copied DataFrame
# with the original DataFrame
print('\nAfter Operation:\n', copydf == df)

print('\nOriginal DataFrame after operation:\n', df)



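When the shallow copy shares data with the original (that is, in pandas versions without Copy-on-Write), a value written in place through the shallow copy also appears in the original DataFrame. Replacing a whole column with copydf['index'] = ... may instead rebind the column in the copy only, depending on the pandas version. With Copy-on-Write, the default from pandas 3.0, changes to a shallow copy no longer propagate to the original.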
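
One way to check the sharing directly, rather than infer it from printed output, is to ask NumPy whether the underlying arrays overlap in memory. This is only a sketch: it assumes that `to_numpy()` returns a view of the column's data for a plain integer column, which is the usual case but is not guaranteed for every dtype, and the column name `a` is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4]})

shallow = df.copy(deep=False)
deep = df.copy(deep=True)

# np.shares_memory is True when two arrays are backed by overlapping memory
shares_with_shallow = np.shares_memory(df['a'].to_numpy(),
                                       shallow['a'].to_numpy())
shares_with_deep = np.shares_memory(df['a'].to_numpy(),
                                    deep['a'].to_numpy())

print(shares_with_shallow)  # True: the shallow copy reuses the same buffer
print(shares_with_deep)     # False: the deep copy owns its own buffer
```

The check describes the state right after copying; it does not by itself say whether a later write will propagate, since that also depends on whether Copy-on-Write is active.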

Deep Copy

A deep copy of a DataFrame or Series has its own copy of the index and data. In general terms, deep copying is recursive: a new collection object is constructed and then populated with copies of the child objects found in the original. Changes made to the data or index of a deep copy are therefore not reflected in the original object.

Example:



# program to demonstrate a deep copy
# of a pandas DataFrame

# import module
import pandas as pd

# create the DataFrame
df = pd.DataFrame({'index': [1, 2, 3, 4],
                   'GFG': ['Mandy', 'Ron', 'Jacob', 'Bayek']})

# deep copy
copydf = df.copy(deep=True)

# compare the deep-copied DataFrame
# with the original DataFrame
print('\nBefore Operation:\n', copydf == df)

# assignment operation on the deep copy
copydf['index'] = [0, 0, 0, 0]

# compare the deep-copied DataFrame
# with the original DataFrame
print('\nAfter Operation:\n', copydf == df)

print('\nOriginal DataFrame after operation:\n', df)



The original DataFrame is unchanged by the assignment made to the deep copy. Note, however, that Python objects stored inside the data are not copied recursively: the deep copy holds references to the same objects as the original. If a DataFrame or Series contains mutable objects, such as lists in an object-dtype column, they are shared between the original and its deep copy, and a modification made to such an object through one is visible through the other.
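
The caveat about mutable data can be illustrated with a column that holds Python lists. This is a sketch using a hypothetical `tags` column. The last step, which applies `copy.deepcopy` to each element to obtain fully independent objects, is one possible workaround and not something the pandas API does on its own.

```python
import copy

import pandas as pd

df = pd.DataFrame({'tags': [['a'], ['b']]})
deep = df.copy(deep=True)

# The deep copy has its own array of references, but both arrays
# still point at the same list objects.
deep.loc[0, 'tags'].append('x')
print(df.loc[0, 'tags'])    # ['a', 'x']

# Possible workaround: copy each stored object individually.
independent = df.copy(deep=True)
independent['tags'] = independent['tags'].map(copy.deepcopy)
independent.loc[0, 'tags'].append('y')
print(df.loc[0, 'tags'])    # ['a', 'x'] (unchanged by the second append)
```

Columns of plain numbers or strings are not affected by this caveat, because their values cannot be mutated in place through a shared reference.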

Differences Between Shallow Copy and Deep Copy

Sr. No. | Shallow Copy | Deep Copy
1 | Copies the collection structure, not the elements. | Copies the collection along with all the elements of the original collection.
2 | In-place changes to the copy can affect the original DataFrame. | Changes to the copy do not affect the original DataFrame.
3 | Does not replicate child objects. | Replicates the index and data; in pandas, Python objects stored inside the data are not copied recursively.
4 | Faster to create than a deep copy. | Slower to create than a shallow copy.
5 | The copy depends on the original. | The copy's index and data are independent of the original.

 
