Difference Between Shallow Copy and Deep Copy in Pandas DataFrames
The pandas library has two main data structures: DataFrame and Series. Internally, each is represented by index arrays, which label the data, and data arrays, which hold the actual values. Copying one of these structures means copying the object’s indices and data, and there are two ways to do so: a shallow copy and a deep copy.
Both are created with the copy() method: pandas.DataFrame.copy(deep=False) produces a shallow copy and pandas.DataFrame.copy(deep=True) produces a deep copy. pandas.Series.copy accepts the same deep argument, and deep=True is the default for both.
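The two calls can be sketched as follows. This is a minimal illustration; the frame contents and the variable names (`df`, `s`, `shallow_df`, and so on) are made up for the example.

```python
import pandas as pd

# Small illustrative objects (contents are arbitrary)
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
s = pd.Series([10, 20, 30], name="x")

shallow_df = df.copy(deep=False)  # shallow copy of a DataFrame
deep_df = df.copy(deep=True)      # deep copy; same as df.copy()
shallow_s = s.copy(deep=False)    # shallow copy of a Series
deep_s = s.copy()                 # deep copy, since deep=True is the default

# Each call returns a new object that holds the same values
print(shallow_df is df, deep_df is df)            # False False
print(shallow_df.equals(df), deep_df.equals(df))  # True True
```

Both kinds of copy look identical at this point; the difference only shows up in how the underlying data is stored and what happens when one of the objects is modified.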
Now, let’s understand what shallow copying is.
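When a shallow copy of a DataFrame or Series is created, the indices and data of the original are not copied; only references to them are. As a result, a change made to the data of one is reflected in the other. Note that this describes pandas without Copy-on-Write: when Copy-on-Write is enabled (the default from pandas 3.0), a shallow copy behaves as a lazy copy, and writing to it no longer changes the original.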
More generally, shallow copying means constructing a new collection object and then populating it with references to the child objects found in the original. The copying process does not recurse, so it does not create copies of the child objects themselves.
In practice, this means that changes applied to the shallow-copied DataFrame are automatically applied to the original DataFrame as well, unless Copy-on-Write is enabled.
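A minimal sketch of this behaviour, using an illustrative one-column frame (the names `original` and `shallow` are assumptions for the example). The value printed on the last line depends on the pandas version and on whether Copy-on-Write is active, so no fixed output is claimed for it.

```python
import numpy as np
import pandas as pd

original = pd.DataFrame({"a": [1, 2, 3]})
shallow = original.copy(deep=False)

# Right after the copy, both objects are backed by the same memory buffer
shares = np.shares_memory(original["a"].to_numpy(), shallow["a"].to_numpy())
print(shares)

# Modify the shallow copy
shallow.loc[0, "a"] = 99
print(shallow.loc[0, "a"])

# Without Copy-on-Write the original sees the change as well;
# with Copy-on-Write it keeps its old value
print(original.loc[0, "a"])
```

Checking for shared memory with `np.shares_memory` is a more reliable way to tell a shallow copy from a deep one than relying on write propagation, since the latter changes with the Copy-on-Write setting.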
A deep copy of a DataFrame or Series has its own copy of the index and data. In general, deep copying means constructing a new collection object and then recursively populating it with copies of the child objects found in the original. Because the copy holds its own data, changes made to the copy are not reflected in the original object, and vice versa.
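The independence of a deep copy can be sketched like this. The frame, its index labels, and the variable names are illustrative assumptions.

```python
import numpy as np
import pandas as pd

original = pd.DataFrame({"a": [1, 2, 3]}, index=["x", "y", "z"])
deep = original.copy(deep=True)

# The deep copy has its own data buffer
print(np.shares_memory(original["a"].to_numpy(), deep["a"].to_numpy()))  # False

# Modifying the copy leaves the original untouched
deep.loc["x", "a"] = 99
print(deep.loc["x", "a"])      # 99
print(original.loc["x", "a"])  # 1
```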
In pandas, however, Python objects stored inside the data are not copied recursively; only the references to them are copied, so they still point to the same memory. For example, if a DataFrame or Series contains mutable objects such as lists, those objects are shared between it and its deep copy, and modifying one of them in place is reflected in the other.
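The caveat about mutable objects can be sketched with a Series of lists. The rebuild with `copy.deepcopy` at the end is one possible workaround shown for illustration, not a pandas feature; the names `s`, `deep`, and `independent` are assumptions for the example.

```python
import copy
import pandas as pd

s = pd.Series([[1, 2], [3, 4]])
deep = s.copy(deep=True)

# The array of references is copied, but the list objects are shared
print(deep.iloc[0] is s.iloc[0])  # True

# An in-place change to a list is visible through both objects
s.iloc[0].append(99)
print(deep.iloc[0])               # [1, 2, 99]

# Possible workaround: rebuild the Series from deep-copied elements
independent = pd.Series([copy.deepcopy(v) for v in s], index=s.index)
s.iloc[0].append(100)
print(independent.iloc[0])        # [1, 2, 99]
```

This only matters for object-dtype data holding mutable Python objects; numeric columns are fully independent after a deep copy.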
Table of Differences Between Shallow Copy and Deep Copy
|Sr no.|Shallow Copy|Deep Copy|
|---|---|---|
|1|It copies the collection structure, not the elements.|It copies the collection together with all of the elements of the original.|
|2|Changes to the copy affect the original DataFrame.|Changes to the copy do not affect the original DataFrame.|
|3|Shallow copy doesn’t replicate child objects.|Deep copy replicates the index and data, but in pandas it does not replicate Python objects stored inside the data.|
|4|Creating a shallow copy is fast compared to a deep copy.|Creating a deep copy is slow compared to a shallow copy.|
|5|The copy is dependent on the original.|The copy is independent of the original, except for mutable objects stored inside the data.|