In the realm of Python data structures, sets are a fundamental and powerful tool that offer unique features and capabilities. A set is an unordered collection of distinct elements, which means there are no duplicate values allowed. This blog post will delve into the world of Python sets, exploring their characteristics, operations, and common use cases.
Set Storage in Python
In the realm of Python data structures, sets are a fundamental and powerful tool that offer unique features and capabilities. A set is an unordered collection of distinct elements, which means there are no duplicate values allowed. This blog post will delve into the world of Python sets, exploring their characteristics, operations, and common use cases.
Here’s an overview of how sets are stored in Python using a hash table:
- Hashing: When you add an element to a set, Python first computes a unique hash value for that element using a hash function. This hash value is a numerical representation of the element’s contents.
- Hash Table Structure: Python uses the computed hash values to determine where to store elements in memory. The hash table is an internal data structure that consists of an array of “buckets” or “slots.”
- Mapping Elements to Buckets: Each element’s hash value is used to determine the appropriate bucket in the array. The hash value acts as an index that points to the corresponding bucket.
- Handling Hash Collisions: Since multiple elements may have the same hash value (a situation known as a hash collision), Python uses strategies like chaining or open addressing to handle collisions and ensure all elements are stored correctly.
- Insertion and Retrieval: When you add an element to the set, Python computes its hash value and determines the appropriate bucket. If there’s no collision, the element is placed in that bucket. When you want to retrieve an element, Python computes the hash value again, locates the correct bucket, and retrieves the element from there.
The use of hash tables for sets ensures that operations like adding, removing, and checking for membership are efficient and have an average time complexity of O(1) under ideal conditions. However, the actual performance can degrade in the presence of hash collisions or other factors that affect the hash function’s distribution.
It’s important to note that sets are unordered collections, meaning the order of elements in a set is not guaranteed and can change as elements are added or removed. This is a consequence of the hash table’s internal structure and the way elements are stored based on their hash values.
1. Introduction to Sets
In Python, a set is an unordered collection of unique elements. It is a built-in data type provided by Python and is highly useful when you want to store multiple items in a single variable while ensuring uniqueness.
Characteristics of Sets
- Unordered Collection:
- Sets are unordered, meaning the order of elements is not guaranteed. Therefore, sets do not support indexing or slicing like lists or tuples.
- Unique Elements:
- Sets only contain unique elements. If you try to add a duplicate element, it won’t be added again.
- Immutable Elements:
- Sets can only contain immutable (hashable) types. Mutable types like lists cannot be elements of a set.
2. Creating Sets
You can create a set using curly braces {} and separating elements with commas. To create an empty set, use the set() function:
empty_set = set()
my_set = {1, 2, 3}
3. Set Operations
a. Iteration
my_set = {1, 2, 3, 4, 5}
for element in my_set:
print(element)
1
2
3
4
5
b. Comparison Operators
- Equality (
==):- Checks if two sets are equal, i.e., they have the same elements.
- Inequality (
!=):- Checks if two sets are not equal, i.e., they have different elements.
- Subset (
<=orissubset()):- Checks if all elements of one set are in another set.
- Proper Subset (
<):- Checks if one set is a proper subset of another (subset but not equal).
- Superset (
>=orissuperset()):- Checks if all elements of one set contain another set.
- Proper Superset (
>):- Checks if one set is a proper superset of another (superset but not equal).
set1 = {1, 2, 3}
set2 = {3, 2, 1}
print(set1 == set2) # Output: True
set1 = {1, 2, 3}
set2 = {4, 5, 6}
print(set1 != set2) # Output: True
set1 = {1, 2}
set2 = {1, 2, 3, 4}
print(set1 <= set2) # Output: True (set1 is a subset of set2)
set1 = {1, 2}
set2 = {1, 2, 3, 4}
print(set1 < set2) # Output: True (set1 is a proper subset of set2)
set1 = {1, 2, 3, 4}
set2 = {1, 2}
print(set1 >= set2) # Output: True (set1 is a superset of set2)
set1 = {1, 2, 3, 4}
set2 = {1, 2}
print(set1 > set2) # Output: True (set1 is a proper superset of set2)
4. Set Functions
- len()
len()Returns the number of elements in a list.
- max():
max()Returns the maximum element in a list.
- min():
min()Returns the minimum element in a list.
- sum():
sum()Returns the sum of all elements in a list.
my_set = {1, 2, 3, 4, 5}
print(len(my_set)) # Output: 5
my_set = {10, 5, 20, 15, 30}
print(min(my_set)) # Output: 5
my_set = {10, 5, 20, 15, 30}
print(max(my_set)) # Output: 30
my_set = {1, 2, 3, 4, 5}
print(sum(my_set)) # Output: 15 (1 + 2 + 3 + 4 + 5)
5
5
30
15
5. Set Methods
- add():
- Adds an element to the set.
- update(iterable)
- Adds multiple elements to the set
- remove():
- Removes a specified element from the set. Raises a
KeyErrorif the element doesn’t exist.
- Removes a specified element from the set. Raises a
- discard():
- Removes a specified element from the set. Does not raise an error if the element doesn’t exist.
- pop():
- Removes and returns an arbitrary element from the set.
- clear():
- Removes all elements from the set, making it an empty set.
- copy():
- Returns a shallow copy of the set.
- union():
- Returns a new set containing all the unique elements from both sets.
- intersection():
- Returns a new set containing common elements from both sets.
- difference():
- Returns a new set containing elements that are in the set but not in the specified set(s).
- symmetric_difference():
- Returns a new set containing elements that are in either set, but not in both.
- isdisjoint():
- Returns
Trueif two sets have no common elements; otherwise,False.
- Returns
- issubset():
- Returns
Trueif all elements of the set are present in the specified set; otherwise,False.
- Returns
- issuperset():
- Returns
Trueif the set contains all elements of the specified set; otherwise,False.
- Returns
# add()
my_set = {1, 2, 3}
my_set.add(4)
print(my_set) # Output: {1, 2, 3, 4}
#update
my_set = {1, 2, 3}
my_set.update([4,5,6)
print(my_set) # Output: {1, 2, 3, 4, 5, 6}
# remove()
my_set = {1, 2, 3}
my_set.remove(2)
print(my_set) # Output: {1, 3}
# discard()
my_set = {1, 2, 3}
my_set.discard(2)
print(my_set) # Output: {1, 3}
# pop()
my_set = {1, 2, 3}
popped_element = my_set.pop()
print(popped_element) # Output: An arbitrary element
print(my_set) # Output: Set with the element removed
# clear()
my_set = {1, 2, 3}
my_set.clear()
print(my_set) # Output: set()
# copy()
my_set = {1, 2, 3}
copy_set = my_set.copy()
print(copy_set) # Output: {1, 2, 3}
# union()
set1 = {1, 2, 3}
set2 = {3, 4, 5}
union_set = set1.union(set2)
print(union_set) # Output: {1, 2, 3, 4, 5}
# intersection()
set1 = {1, 2, 3}
set2 = {3, 4, 5}
intersection_set = set1.intersection(set2)
print(intersection_set) # Output: {3}
# difference()
set1 = {1, 2, 3}
set2 = {3, 4, 5}
difference_set = set1.difference(set2)
print(difference_set) # Output: {1, 2}
# symmetric_difference()
set1 = {1, 2, 3}
set2 = {3, 4, 5}
symmetric_diff_set = set1.symmetric_difference(set2)
print(symmetric_diff_set) # Output: {1, 2, 4, 5}
# isdisjoint()
set1 = {1, 2, 3}
set2 = {4, 5, 6}
print(set1.isdisjoint(set2)) # Output: True (No common elements)
# issubset()
set1 = {1, 2}
set2 = {1, 2, 3, 4}
print(set1.issubset(set2)) # Output: True (set1 is a subset of set2)
# issuperset()
set1 = {1, 2, 3, 4}
set2 = {1, 2}
print(set1.issuperset(set2)) # Output: True (set1 is a superset of set2)
{1,2,3,4}
{1,2,3,4,5,6}
{1,3}
{1,3}
3
{1,2}
set()
{1,2,3}
{1,2,3,4,5}
{3}
{1,2}
{1,2,4,5}
True
True
True
6. Set Comprehension
Set comprehension in Python allows you to create a new set by specifying an expression and an iterable. It iterates over the iterable, applies the expression to each item, and includes the result in the set. Set comprehension follows a similar syntax to list comprehension but produces a set instead.
#Creating a set of squares of numbers:
squares = {x * x for x in range(1, 6)}
print(squares) # Output: {1, 4, 9, 16, 25}
#Creating a set from a list with duplicates:
my_list = [1, 2, 3, 3, 4, 5, 5]
unique_numbers = {x for x in my_list}
print(unique_numbers) # Output: {1, 2, 3, 4, 5}
7. Mutable Nature of Sets & Immutable Nature of Set Elements
In Python, sets are mutable, meaning you can modify a set by adding or removing elements. However, the elements within a set must be immutable.
The mutable nature of sets allows you to perform various operations, such as adding new elements, removing elements, and updating the set. On the other hand, the elements contained within a set must be immutable. This ensures that the set’s internal structure remains consistent and efficient, as mutable elements could potentially change their hash values, leading to inconsistencies in the set’s behavior.
Immutable elements include integers, floats, strings, tuples, and frozensets. For instance, you can add tuples (immutable) to a set, but not lists (mutable)
A set cannot be an element of another set because sets are mutable and therefore not hashable, which is a requirement for elements in a set.
To have a set-like structure that can be an element of another set, you can use a frozenset. A frozenset is an immutable set, meaning its elements cannot be modified once assigned, and it can be used as an element in another set or as a dictionary key.
Here’s an example demonstrating the use of frozenset:
frozen_set1 = frozenset([1, 2, 3])
frozen_set2 = frozenset([2, 3, 4])
# Using frozensets as elements in a set
set_of_frozensets = {frozen_set1, frozen_set2}
print(set_of_frozensets) # Output: {frozenset({1, 2, 3}), frozenset({2, 3, 4})}
# Attempting to use a mutable set as an element (will raise an error)
# set_of_sets = {set([1, 2, 3])} # This will raise an error