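To 'open' a file in Python, use the `open()` function. Depending on the mode argument, this returns either a readable file object or a writable file object. With the basic modes used here (`'r'`, `'w'`, `'x'`, `'a'`) a file object is for reading or for writing, not both; combined modes such as `'r+'` exist but are not used in these notes.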

**NOTE:** These examples won't run as-is (the files they reference don't exist). They are the code from my slides, included here in order. Runnable examples are further below.

In [None]:
readableFile = open('inputFile.txt', 'r')

writableFile = open('outputFile.txt', 'w')

carefulWriteFile = open('outputFile.txt', 'x')

appendableFile = open('outputFile.txt', 'a')


In [None]:
test = open('beepboop.txt','w')
test.write('blahblah')
test.close()

test = open('beepboop.txt', 'x')
test.close()
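The last `open()` in the cell above is the interesting one: `'x'` (exclusive creation) refuses to open a file that already exists. Here is a minimal sketch of that behaviour, assuming a throwaway temporary directory so nothing real is touched (`demo_dir` and `demo_path` are illustrative names, not part of the slides):

```python
import os
import tempfile

# Assumption for this sketch: work inside a fresh temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'beepboop.txt')

# 'w' creates the file (or truncates it if it already exists).
with open(demo_path, 'w') as f:
    f.write('blahblah')

# 'x' only succeeds if the file does NOT exist yet.
try:
    with open(demo_path, 'x') as f:
        f.write('this line is never reached')
    exclusive_create_failed = False
except FileExistsError:
    exclusive_create_failed = True

print(exclusive_create_failed)

# The earlier contents are untouched by the failed 'x' open.
with open(demo_path) as f:
    contents = f.read()
print(contents)
```

This is why `'x'` is the 'careful' write mode: it fails loudly instead of silently overwriting.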

When you 'open' a file, it is good practice to 'close' it when you are done. You can either call `close()` manually or do all your file I/O inside a `with` statement, which closes the file for you when the block ends.

In [None]:
# Approach 1

readableFile = open('inputFile.txt', 'r')

fileData = readableFile.read()

readableFile.close()


# Approach 2
with open('inputFile.txt', 'r') as myFile:
    fileData = myFile.read()
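One practical difference between the two approaches is what happens when something goes wrong between `open()` and `close()`. The sketch below is only an illustration: it creates its own scratch file (`scratch_path` is a made-up name) and simulates a failure with a `ValueError`.

```python
import os
import tempfile

# Hypothetical scratch file so the sketch is self-contained.
scratch_path = os.path.join(tempfile.mkdtemp(), 'inputFile.txt')
with open(scratch_path, 'w') as f:
    f.write('some data')

# Approach 1, made safe with try/finally: close() runs even if read() raises.
readableFile = open(scratch_path, 'r')
try:
    fileData = readableFile.read()
finally:
    readableFile.close()
print(readableFile.closed)

# Approach 2: the with statement performs the same cleanup automatically,
# even when the body raises an exception.
try:
    with open(scratch_path, 'r') as myFile:
        raise ValueError('simulated failure mid-read')
except ValueError:
    pass
print(myFile.closed)
```

Both file objects report `closed` as `True`; the `with` version simply needs less code to get there.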

Example code blocks involving reading / writing files won't work unless you have the files locally. Accordingly I've provided the *write* examples before the *read* examples so you can generate your own files.

**NOTE:** The functions will create (or attempt to create) a 'data' folder to store the written files. Feel free to comment those lines out and adjust the file paths as needed if your permissions are set to not allow this.

**Write Examples:**

There are two key ways to 'write' a file -- `write (w)` and `append (a)`. Write will overwrite the file if it exists (so be careful!) while append will write anything you add to the end of the file. They are both dangerous in different ways.

In [None]:
import os

if not os.path.exists('data'):
    os.mkdir("data")

with open('data/temp1.txt', 'w') as myFile:
    for i in range(10):
        myFile.write(str(i))
    myFile.write("\n")
    myFile.write("Line 2")

myFile = open('data/temp2.txt', 'w')
for i in range(5):
    myFile.write(str(i) + "\n")
myFile.close()

In [None]:
with open('data/temp2.txt', 'a') as myFile:
    myFile.write("Hello World!\n")
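To see the difference between `'w'` and `'a'` side by side, here is a small sketch that writes to the same file three times and reads it back after each step. The helper `write_then_read` and the file name are hypothetical, and a temporary directory is assumed so the 'data' folder is left alone.

```python
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), 'temp_demo.txt')  # hypothetical file name

def write_then_read(mode, text):
    """Open log_path in the given mode, write text, and return the full file contents."""
    with open(log_path, mode) as f:
        f.write(text)
    with open(log_path) as f:
        return f.read()

after_first_write = write_then_read('w', 'first\n')    # file created
after_append = write_then_read('a', 'second\n')        # added to the end
after_second_write = write_then_read('w', 'third\n')   # earlier content is gone

print(repr(after_first_write))
print(repr(after_append))
print(repr(after_second_write))
```

The danger of `'w'` is losing data you wanted; the danger of `'a'` is keeping data you did not (for example, re-running a cell and appending the same lines twice).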

**Read Examples:** 

`readlines()`, `read()` and `readline()` each parse a different 'amount' of the input file. When might you prefer one over the other?

In [None]:
with open('data/temp1.txt') as myFile:
    inList = myFile.readlines()
print(inList)

myFile = open('data/temp2.txt')
for i in range(10):
    print("Line Content: {}".format(myFile.readline()))
myFile.close()

with open('data/temp1.txt') as myFile:
    print(myFile.read())

In [None]:
myFile = open('data/temp2.txt')
for i in range(6):
    print("Line Content: {}".format(myFile.readline().strip()))
myFile.close()

with open('data/temp2.txt') as myFile:
    for line in myFile:
        print(line.strip())
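As a compact comparison of the three read methods, the sketch below runs each of them on the same three-line file. The file is created on the fly in a temporary directory (an assumption of this sketch, so it does not depend on the 'data' folder).

```python
import os
import tempfile

sample_path = os.path.join(tempfile.mkdtemp(), 'sample.txt')  # hypothetical file
with open(sample_path, 'w') as f:
    f.write('0\n1\n2\n')

with open(sample_path) as f:
    whole_text = f.read()        # one string containing the entire file

with open(sample_path) as f:
    all_lines = f.readlines()    # list of strings, one per line, newlines kept

with open(sample_path) as f:
    first_line = f.readline()    # only the next line
    second_line = f.readline()
    f.readline()                 # consume the third (last) line
    past_end = f.readline()      # empty string signals end of file

print(repr(whole_text))
print(all_lines)
print(repr(first_line), repr(second_line), repr(past_end))
```

Note that `readline()` returns an empty string rather than raising an error once the file is exhausted, which is why asking for more lines than a file contains prints blank "Line Content:" entries.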

In [None]:
with open('data/temp1.txt') as myFile:
    text = myFile.read()
    print("'{}'".format(text))
    print(text.split())

    print(text.split("5"))


**Programming Practice:** Write a function that given an input file and a filename, writes a new file that contains all the same lines in reversed order.

In [None]:
# Your code here

When measuring code efficiency, there are two main metrics -- **time** and **memory** use. We can observe both in Python using standard-library modules.

The **tracemalloc** module allows us (among other things) to track both the peak memory usage over a section of code and the memory currently allocated by that code. It's worth noting that the example below (with the `import pandas` line uncommented) will only show the cost of the import once -- once pandas has been imported, it remains 'in memory' until we restart the entire Python notebook. (There are commands we can use to force a 'reload' of the package, but there's generally no need to do so, as it's rather inefficient.)

In [None]:
import tracemalloc

tracemalloc.start()

#import pandas as pd #23978913 24000645

current, peak = tracemalloc.get_traced_memory() # 605 10974

print(current, peak)
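If you don't have pandas installed, the same idea can be seen with a plain Python list. This is a minimal sketch, and the list size of 100,000 is an arbitrary choice; the exact byte counts will differ between machines and Python versions, so none are shown here.

```python
import tracemalloc

tracemalloc.start()

# Allocate something noticeable: a list of 100,000 integers.
big_list = list(range(100_000))
current_with_list, peak_with_list = tracemalloc.get_traced_memory()

# Drop the list: 'current' should fall, while 'peak' keeps the high-water mark.
del big_list
current_after_del, peak_after_del = tracemalloc.get_traced_memory()

tracemalloc.stop()

print(current_with_list, peak_with_list)
print(current_after_del, peak_after_del)
```

The first number of each pair is the memory currently traced and the second is the peak since tracing began.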

The **timeit** module allows us to measure the time it takes code snippets to run and -- depending on the function -- is capable of repeating the analysis multiple times to account for variations due to real-world complexity.

In [None]:
import timeit

def doStuff_A(n):
    total = 0
    for i in range(n):
        for j in range(n):
            total+=j

def doStuff_B(n):
    total = 0
    for i in range(n):
        for j in range(int(n/2)):
            total+=j

SETUP_CODE = '''
from __main__ import doStuff_A, doStuff_B'''

TEST_CODE= '''
doStuff_A(1000)
'''

myTime = timeit.repeat(setup=SETUP_CODE, stmt=TEST_CODE, repeat=5, number=10)
print(myTime)

TEST_CODE= '''
doStuff_B(1000)
'''
myTime = timeit.repeat(setup=SETUP_CODE, stmt=TEST_CODE, repeat=5, number=10)
print(myTime)
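`timeit.repeat` also accepts a callable for `stmt`, which avoids the `from __main__ import ...` setup string. The sketch below uses that form with two renamed stand-ins for the functions above (`nested_full` and `nested_half` are names chosen for this example) and a smaller `n` so it finishes quickly. Taking the minimum of the repeats is a common way to summarise the result, since the slower runs mostly reflect other activity on the machine.

```python
import timeit

def nested_full(n):
    """Inner loop runs n times per outer iteration."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += j
    return total

def nested_half(n):
    """Inner loop runs n // 2 times per outer iteration."""
    total = 0
    for i in range(n):
        for j in range(n // 2):
            total += j
    return total

# Passing a callable instead of a code string; no setup string needed.
times_full = timeit.repeat(lambda: nested_full(200), repeat=3, number=5)
times_half = timeit.repeat(lambda: nested_half(200), repeat=3, number=5)

best_full = min(times_full)
best_half = min(times_half)
print("full: {:.4f}s  half: {:.4f}s".format(best_full, best_half))
```

You would typically expect the second function to take roughly half as long, yet both loops are still quadratic in `n`; a constant factor of 2 does not change how the runtime grows.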

In [None]:
import random
import time

def timeWaste():
    stall_time = random.randint(1, 5)
    print("Wasting {} time".format(stall_time))
    time.sleep(stall_time)

start = timeit.default_timer()
for i in range(5):
    timeWaste()
end = timeit.default_timer()

print("I wasted {} time total.".format(end-start))

**Big O** or asymptotic efficiency is a much better way of analyzing algorithm (and data structure) performance. While we will discuss the mathematical definition in class, a simplified version is: identify an upper bound on how the runtime of a function grows as the size of its input grows towards infinity.

In [None]:
# Constant Time
def constant(n):
    ops = 0
    for i in range(10):
        ops+=n
    return ops

print(constant(5))
print(constant(9001))

In [None]:
# Logarithmic Time
import math
def logarithmic(n):
    ops = 0
    for i in range(int(math.log2(n))):
        ops+=1
    return ops

print(logarithmic(5))
print(logarithmic(9001))

In [None]:
# Linear Time
def linear(n):
    ops = 0
    for i in range(n):
        ops+=1
    return ops

print(linear(5))
print(linear(9001))

In [None]:
# Quadratic Time
def quadratic(n):
    ops = 0
    for i in range(n):
        for j in range(n):
            ops+=1
    return ops

print(quadratic(5))
print(quadratic(9001))
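One way to build intuition for these four classes is to double the input and watch how the number of loop iterations changes. The sketch below uses operation-counting variants of the functions above (the `ops_*` names are introduced just for this example, and unlike `constant` above, `ops_constant` returns the iteration count rather than a sum).

```python
import math

def ops_constant(n):
    return sum(1 for _ in range(10))                   # always 10 iterations

def ops_logarithmic(n):
    return sum(1 for _ in range(int(math.log2(n))))    # about log2(n) iterations

def ops_linear(n):
    return sum(1 for _ in range(n))                    # n iterations

def ops_quadratic(n):
    return sum(1 for _ in range(n) for _ in range(n))  # n * n iterations

growth = {}
for name, fn in [('constant', ops_constant),
                 ('logarithmic', ops_logarithmic),
                 ('linear', ops_linear),
                 ('quadratic', ops_quadratic)]:
    # Each input is double the previous one.
    growth[name] = [fn(n) for n in (64, 128, 256)]
    print(name, growth[name])
```

Each time `n` doubles, the constant count stays put, the logarithmic count goes up by one, the linear count doubles, and the quadratic count quadruples.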

In [None]:
def doStuff(inList1, inList2):
    c1 = 0
    for i in inList1:
        c1+=1
    
    c2 = 0
    for v1 in inList1:
        for v2 in inList2:
            c2+=1
    return c1, c2

print(doStuff([1,2,3], list(range(10))))

In [None]:
def doStuff2(inList):
    ops = 0
    size = len(inList)
    while size > 0:
        size = int(size / 2)
        ops+=1
    return ops

def doStuff3(inList1, inList2):
    ops = 0
    for i in inList1:
        ops+= doStuff2(inList2)
    return ops

print(doStuff3([1,2,3], list(range(10))))
print(doStuff3([1,2,3], list(range(100))))
print(doStuff3([1,2,3], list(range(1000))))
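The repeated halving in `doStuff2` is the classic source of a logarithm: a size of `m` can be halved `floor(log2(m)) + 1` times before reaching zero, which is also the number of bits needed to write `m` in binary. A quick sketch to check this (`halving_steps` is a stand-in name that takes the size directly and uses `//` for integer division):

```python
def halving_steps(size):
    """Count how many times size can be halved (integer division) before hitting 0."""
    ops = 0
    while size > 0:
        size //= 2
        ops += 1
    return ops

for m in (10, 100, 1000):
    # The two counts agree for every positive integer m.
    print(m, halving_steps(m), m.bit_length())

# doStuff3 repeats the halving loop once per element of its first list,
# so with a 3-element first list the totals are 3 * halving_steps(m).
totals = [3 * halving_steps(m) for m in (10, 100, 1000)]
print(totals)
```

So `doStuff3` performs on the order of `n * log(m)` operations, where `n` and `m` are the lengths of its two input lists: growing the second list by a factor of 10 only adds a handful of steps per element of the first.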

In [None]:
import math  # needed for math.ceil; repeated here so the cell runs on its own

def convert_1D_to_2D(inList, rowSize):
    listLen = len(inList)
    numRows = math.ceil(listLen/rowSize)

    outList = []
    count = 0

    ops = 0
    for i in range(numRows):
        tempList = []

        for j in range(rowSize):

            if count >= listLen:
                tempList.append(-1)
            else:
                tempList.append(inList[count])

            ops+=1
            count+=1

        outList.append(tempList)

    print(ops)
    return outList

convert_1D_to_2D([1,2,3,4,5], 10)
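For comparison, here is a sketch of the same conversion written with padding and slicing instead of explicit counters. It is an alternative formulation rather than the version from the slides; the function name and the `pad` parameter are introduced here for illustration.

```python
import math

def convert_1d_to_2d_sliced(in_list, row_size, pad=-1):
    """Split in_list into rows of row_size, padding the last row with pad."""
    num_rows = math.ceil(len(in_list) / row_size)
    # Extend a copy of the list so its length is an exact multiple of row_size.
    padded = in_list + [pad] * (num_rows * row_size - len(in_list))
    # Slice out one row at a time.
    return [padded[r * row_size:(r + 1) * row_size] for r in range(num_rows)]

wide = convert_1d_to_2d_sliced([1, 2, 3, 4, 5], 10)
narrow = convert_1d_to_2d_sliced([1, 2, 3, 4, 5], 2)
print(wide)
print(narrow)
```

Either way, the work done is proportional to `numRows * rowSize` (every output cell is filled exactly once), which is why the original prints 10 operations for a 5-element list with a row size of 10.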