Python Reminders¶

I assume that you all know Python. A brief introduction to Python Basics can be found in this notebook from last year (ipynb, html). Here we will only review some useful concepts

A Python List is a versatile, ordered collection of items created using square brackets []. It's one of Python's most powerful data structures because of three core features:

  1. Mutable (Dynamic Change): Lists are changeable. You can dynamically add, remove, or modify items after the list is created, allowing its contents to change during program execution.
  2. Iterable (Loopable): Lists are ordered and sequence-based, meaning you can easily traverse (loop through) every item, one by one, typically using a for loop.
  3. Heterogeneous (Mixed Types): A single list can hold items of different data types (numbers, strings, booleans, etc.) simultaneously.

image.png

List Comprehension¶

Recall the mathematical notation:

$$L_1 = \left\{x^2 : x \in \{0,1,\ldots ,9\}\right\}$$

$$L_2 = \left(1, 2, 4, 8,\ldots, 2^{12}\right)$$

In [1]:
L12 = []
for x in range(10):
    L12.append(x**2)
print(L12)
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
In [2]:
L1 = [x**2 for x in range(10)]
L2 = [2**i for i in range(13)]
print(f"L1 = {L1}")
print(f"L2 = {L2}")
L1 = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
L2 = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096]

List comprehension with conditions

$$M = \left\{x \mid x \in L_1 \text{ if } x \text{ is even}\right\}$$

In [4]:
L3 = [x for x in L1 if x % 2 == 0]
print(f"L3 = {L3}")
L3 = [0, 4, 16, 36, 64]

Nested use of link comprehension

In [5]:
[x for x in [x**2 for x in range(10)] if x % 2 == 0]
Out[5]:
[0, 4, 16, 36, 64]

Use of list comprehension for string processing

In [6]:
words = 'The quick brown fox jumps over the lazy dog'.split()
print(words)
['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
In [7]:
upper = [w.upper() for w in words]
print(upper)
['THE', 'QUICK', 'BROWN', 'FOX', 'JUMPS', 'OVER', 'THE', 'LAZY', 'DOG']
In [8]:
upper_lower = [[w.upper(), w.lower()] for w in words]
print(upper_lower)
[['THE', 'the'], ['QUICK', 'quick'], ['BROWN', 'brown'], ['FOX', 'fox'], ['JUMPS', 'jumps'], ['OVER', 'over'], ['THE', 'the'], ['LAZY', 'lazy'], ['DOG', 'dog']]
In [9]:
long_words = [x for x in words if len(x) > 3]
print(long_words)
['quick', 'brown', 'jumps', 'over', 'lazy']

Use list comprehension for obtaining input

In [10]:
s = input('Give numbers separated by comma: ')
x = [int(n) for n in s.split(',')]
print(x)
Give numbers separated by comma: 1,2,3,4
[1, 2, 3, 4]

Creating vectors and matrices

Create a vector of 10 zeros

In [11]:
z = [0 for i in range(10)]
print(z)
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Create a 10x10 matrix with all 0s¶

In [12]:
M = [[0 for i in range(10)] for j in range(10)]
M
Out[12]:
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]

Set the diagonal to 1

In [13]:
for i in range(10): M[i][i] = 1
M
Out[13]:
[[1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
 [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]]

Create a list of 10 random integers in [0,99]

In [17]:
import random
R = [random.choice(range(100)) for i in range(10)]
print(R)
[87, 75, 57, 69, 62, 63, 8, 15, 32, 10]

Question: By default which distribution does random.choice chooses? (remember from last lesson in statistics)

Removing elements from a list

Removing elements from a list while you iterate through it can lead to problems...

In [19]:
L = [1,2,4,5,6,8]
for x in L:
    if x%2 == 0:
        L.remove(x)
print(L)
[1, 4, 5, 8]

Another way to do this using list comprehension:

We can create a new list with the entries we want to keep

In [20]:
L = [1,2,4,5,6,8]
L = [x for x in L if x%2 == 1] #creates a NEW list
print(L)
[1, 5]

Or create a list with the entries we want to remove

In [22]:
L = [1,2,4,5,6,8]
R = [y for y in L if y%2 == 0] # R = [2,4,6,8]
for x in R: L.remove(x)
print(L)
[1, 5]

Using a Dictionary in the list comprehension

image.png

In [24]:
D = {'A':1,'B':5,'C':4,'D':2}
print([x for x in D if D[x]>2])
['B', 'C']

Dicitonary Comprehension¶

We can create dictionaries in a similar way as with list comprehension

In [25]:
{str(i):i for i in [1,2,3,4,5]}
Out[25]:
{'1': 1, '2': 2, '3': 3, '4': 4, '5': 5}
In [26]:
fruits = ['apple', 'mango', 'banana','cherry']
fl = {f:len(f) for f in fruits}
fl
Out[26]:
{'apple': 5, 'mango': 5, 'banana': 6, 'cherry': 6}
In [27]:
f_dict = {f.capitalize():i for i,f in enumerate(fruits)}
print(f_dict)
{'Apple': 0, 'Mango': 1, 'Banana': 2, 'Cherry': 3}
In [28]:
{v:k for k,v in f_dict.items()}
Out[28]:
{0: 'Apple', 1: 'Mango', 2: 'Banana', 3: 'Cherry'}
In [29]:
f_dict2 = {i:f.capitalize() for i,f in enumerate(fruits)}
print(f_dict2)
{0: 'Apple', 1: 'Mango', 2: 'Banana', 3: 'Cherry'}

Using the right data structure¶

Using the right data structure makes a big difference when handling large data. Dictionaries and sets have expected constant time for finding an element, while lists have linear complexity. Even logarithmic time is significantly faster than linear.

Example

Looking for 100K integers in a collection of 100K integers

In [30]:
import time
import random
L = [random.choice(range(10000000)) for i in range(100000)] # the integers we are searching in a list
S = set(L) # and in a set
Q = [random.choice(range(10000000)) for i in range(100000)] # the list of integers we will look up (queries)
In [31]:
#Searching the set
start = time.time()
[x for x in Q if x in S]
end = time.time()
print(end-start)
0.015481948852539062
In [32]:
#Searching the list
start = time.time()
[x for x in Q if x in L]    # The ONLY change now is L instead of S
end = time.time()
print(end - start)
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
/tmp/ipython-input-2079384362.py in <cell line: 0>()
      1 #Searching the list
      2 start = time.time()
----> 3 [x for x in Q if x in L]    # The ONLY change now is L instead of S
      4 end = time.time()
      5 print(end - start)

KeyboardInterrupt: 

Example

You are given a graph in the form of a collection of edges, that is, pairs of vertices

How do you store it in order to be able to quickly answer if there is an edge (x,y) in the graph, and also to get all the neighbors of node?

Create a dictionary with nodes as the keys, and sets with neighboring nodes as values

In [33]:
E = [(1,2),(2,3),(2,5),(2,6),(2,7),(3,4),(3,5),(5,6),(5,7),(7,8),(8,9),(8,10)]
G = {}
for (x,y) in E:
    if x not in G:
        G[x] = set()
    if y not in G:
        G[y] = set()
    G[x].add(y)
    G[y].add(x)
G
Out[33]:
{1: {2},
 2: {1, 3, 5, 6, 7},
 3: {2, 4, 5},
 5: {2, 3, 6, 7},
 6: {2, 5},
 7: {2, 5, 8},
 4: {3},
 8: {7, 9, 10},
 9: {8},
 10: {8}}

This code builds an Adjacency List to represent an undirected graph.

It converts the simple list of connections ($E$) into a dictionary ($G$) that shows the neighbors for every node:

  1. Input ($E$): A list of edges (connections), like (x, y).
  2. Output ($G$): The Adjacency List dictionary.
    • Keys are the graph nodes (vertices).
    • Values are sets of that node's neighbors.

The loop ensures that for every connection $(x, y)$, both $y$ is added to $x$'s neighbors and $x$ is added to $y$'s neighbors, making the graph undirected.

In [34]:
G[2]
Out[34]:
{1, 3, 5, 6, 7}

The Random library¶

We will often need to work with randomness. A library for this that is part of the main Python distribution is the random library.

Useful functions:

  1. seed: Allows you to repeat the same random choices in different experiments
  2. random: produces a random real number between 0 and 1
  3. randint: select int for a range
  4. choice: select an element from a list
  5. choices: select k elements form a list with replacement. It is possible to use weights
  6. shuffle: suffle a list of elements
  7. sample: sample k elements from a list

Example¶

How do I implement the following?

With probability 0.7 print 'A', with probability 0.2 print 'B', and with probability 0.1 do nothing.

In [47]:
import random
p = random.random()
if p <0.7: print('A')
elif p < 0.9: print('B')
A

Example¶

From a list I want to sample k elements where k is a parameter. I want my samples for the different k's to have the property that smaller samples are subsets of bigger samples. That is, the sample of size k+1 will contain one more random element to the sample of size k.

In [49]:
L = [i for i in range(20)]
random.shuffle(L)
sample4 = L[:4]
sample5 = L[:5]
In [50]:
print(sample4)
print(sample5)
[13, 18, 10, 2]
[13, 18, 10, 2, 12]