Open In App

Internal implementation of Data Structures in Python

Last Updated : 08 Sep, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

Python provides a variety of built-in data structures, each with its own characteristics and internal implementations optimized for specific use cases.

In this article we are going to discuss about the most commonly used Data structures in Python and a brief overview of their internal implementations:

Data Structure

Internal Implementation

Static or Dynamic

Python List

Dynamic Arrays

Dynamic

Python Tuples

Arrays

Static

Python Sets

Hash Tables

Dynamic

Python Deque

Doubly linked list

Dynamic

Python Dictionary

Hash Tables

Dynamic

Python String

Array of Unicode characters

Dynamic

Python Array

Array

Static

Python HeapQ

Binary Trees

Dynamic

Python Hash maps

Hash Functions

Dynamic

Lists:

  • Lists are ordered collections of items, and their elements can be of different data types.
  • Internally, lists are implemented as dynamic arrays that can grow or shrink as needed. When the array is full, a new larger array is created, and elements are copied over.
  • This resizing operation is amortized, meaning it doesn’t happen every time an element is added.

Tuples:

  • Tuples are similar to lists but are immutable, meaning their elements cannot be changed after creation.
  • Internally, tuples are usually implemented as fixed-size arrays, and their immutability allows for some memory optimization and better performance in certain scenarios.

Sets:

  • Sets are unordered collections of unique elements.
  • Internally, sets are implemented using hash tables. Each element is hashed, and the hash value is used to quickly locate the element within the table.

Dictionaries:

  • Dictionaries (dicts) are collections of key-value pairs, where keys are unique and immutable.
  • Dicts are implemented using hash tables as well. Keys are hashed to determine their position in the table, and values are associated with these keys.

Strings:

  • Strings are sequences of characters.
  • Internally, Python strings are usually implemented as arrays of Unicode characters. The specific implementation details can vary depending on factors like Python version and compilation options.

Arrays:

  • Python provides an array module that allows you to create arrays with elements of a specific data type.
  • These arrays are more memory-efficient than lists when dealing with large quantities of homogeneous data.

Heap:

  • Python’s heapq module provides heap-related functions.
  • Heaps are binary trees that satisfy the heap property (parent nodes are smaller or larger than their children).

Collections from the collections module:

  • This module provides specialized data structures like defaultdict, Counter, OrderedDict, and deque.
  • They offer optimized implementations for specific use cases, such as default values, counting elements, ordered insertion, and double-ended queues.

Hash Maps:

  • Also known as hash tables or associative arrays.
  • Similar to dictionaries in Python, they store key-value pairs and offer fast access to values based on their keys.
  • Internally, they use a hash function to convert keys into indices in an array, allowing for efficient lookups.

Graphs:

  • Graphs consist of nodes (vertices) and edges connecting these nodes.
  • They can be implemented using various structures, such as adjacency matrices (2D arrays) or adjacency lists (lists of linked nodes).
  • Graphs are used to model relationships between entities.

Remember that the exact internal implementation details might vary between Python versions and implementations (like CPython, Jython, IronPython, etc.). The Python language abstracts these implementation details and provides a consistent interface for developers to work with these data structures.


Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads