mp_train

Terrible Trains

Due: Sep 29, 23:59 PM

Learning Objectives

  • Practice file IO on CSVs
  • Practice using list ADT functions implemented with linked lists
  • Use object-oriented programming to organize a complex multi-dimensional dataset as a linked list

Optionally:

  • Experience error correction on data that should not be thrown out if possible (most real world data)
  • Demonstrate mastery of OOP through a complex multi-instance class function

Assignment Description

As the new train conductor for Loyal Linked Listings incorporated (LLL inc), you have been given control of the leading car of a theoretically infinite-length train – an Infinity Train of sorts. Given a slightly modified version of the linked list code covered in class, your job is demonstrate your proficiency with the exisitng Train infrastructure and implement methods to load a custom Train from a csv file of TrainCars.

Train conductor’s looking for a more advanced exercise can additionally improve the csv loading process to make sure as many as possible TrainCars are properly loaded from the manifest and write a new function for your Infinity Train – the ability to merge cars with another train (assuming both Trains are sorted in ascending order).

You should begin both exercises by familiarizing yourself with the existing implementation of the Train and TrainCar classes. There are small differences between what was shown in class and what has been implemented for you.

Part 1: Demonstrate proficiency with existing infrastructure

As it’s your first day on the job, LLL wants you to demonstrate proficiency with the baseline Train object. To do this they have given you three tasks dealing with inserting, removing, and accessing elements in a Train object. You are strongly encouraged to use the built-in functions covered in class and provided in the assignment (add, index, insert, remove, pop)

buildDefaultTrain()

None Train.buildDefaultTrain()
# Input: None
# Output: None
# Should construct a five car default train and
# should set Train's head variable to the first car

The first test of your knowledge is to construct a ‘default’ Train which is comprised of five trains that must be stored in the given order:

  • Index 0: (Class: 1, Passengers: 20, Cargo: 40)
  • Index 1: (Class: 2, Passengers: 50, Cargo: 100)
  • Index 2: (Class: 2, Passengers: 60, Cargo: 100)
  • Index 3: (Class: 3, Passengers: 100, Cargo: 100)
  • Index 4: (Class: 3, Passengers: 150, Cargo: 200)

NOTE: The implementation for add() and insert() take in a TrainCar not a single data value (TrainCars have multiple stored values).

removeEveryOther()

None Train.removeEveryOther()
# Input: None
# Output: None
# Starting with the first car, remove every other car from the Train
# Ex: A length 5 train 0->1->2->3->4 would remove indices 0,2,4
# And the resulting train would be 1->3 with 1 as the new head

The second test of your knowledge is to take an existing Train and remove every other TrainCar starting with the first car. When implementing this function, you may find it helpful to review how a Python range() can be defined.

NOTE: Take a close look at the difference in input for remove() and pop() when deciding which function to use here. Remember that your solution must work for all possible trains, not just the default one.

HINT: If you remove the first TrainCar first, what happens to the index values for all subsequent cars? Is there a simple way to get around this complication?

HINT 2: If you have a list of values l=[0, 2, 4], a useful Python function is l.reverse() which will invert the list.

totalPassengers()

int Train.totalPassengers()
# Input: None
# Output: an integer recording the total passengers on all cars in the Train

The final test of your knowledge is to walk through each TrainCar in the Train and sum up all passengers in all cars. For example, the ‘default’ Train has 150+100+60+50+20 = 380 passengers.

HINT: You can do this in many ways – some of the easiest to code involve either using index() directly or demonstrating your understanding of what index is actually doing. The latter is (probably) significantly more efficient code but both are valid solutions here.

Part 2: Parse a CSV to produce a Train

Now that you are a veteran train conductor, your next task is to become a veteran data scientist. LLL incorporated has given you several CSV formatted files and tasked you with converting these text documents into functional Train objects. To assist you with writing and testing the code, this singular task has been broken into two functions.

parseTrain()

TrainCar parseTrain(str inString)
# Input: 
# inString, an input string storing a single row of a csv
# Here this will always have the form 'x,y,z'
# Output: 
# A TrainCar object storing the head (first car) listed in the file
# The TrainCar object should be a complete in-order linked list of all valid cars
# in the input file, including corrections (if you are attempting it)

Given a text string of the form “x,y,z” for passenger class, number passengers, and cargo respectively return a TrainCar object storing all three values as integers.

To review the key parts here:

  • The input string must be split up into multiple sub-strings
  • Each substring has to be converted to an integer (or at least be convertable – see TrainCar constructor)
  • Those integers should be stored in a TrainCar object

loadTrain()

# Input: 
# inFile, an input string storing the path and filename of an input .csv file
# Output: 
# A TrainCar object storing the head (first car) listed in the file
# The TrainCar object should be a complete in-order linked list of all valid cars
# in the input file, including corrections made when possible.

Given a CSV file as inFile, loadTrain() should open the file and parse each line as an individual TrainCar. You may find it very helpful to use parseTrain() but should make sure that the formatting matches the expected input. For example, it might be a good idea to strip() the line of whitespace before parsing it depending on your implementation strategy.

You should also pay close attention to the fact that the Train should store objects in the same order the lines are read in. If you repeatedly add() each TrainCar as you see it, will this result in the correct order? What other approaches might work?

To review the key parts here:

  • An input file should be read in line-by-line
  • Each line, when formatted properly, should be sent to parseTrain() to create a TrainCar object
  • A Train should be built from the TrainCar objects

NOTE: You should open (and manually read) the input CSV before determing how to parse the file. Is there a line in the file which shouldn’t be parsed? What will happen if you try? How can you avoid it?

HINT: While most linked lists do not include a ‘size’ variable, I have implemented one for convenience. You may find the variable helpful when writing loadTrain().

Optional Extensions

If you feel especially comfortable or would like to challenge your understanding of Python classes and linked lists, try either or both of the following exercise on for size!

loadTrain() redux

If you are comfortable with the assignment and are looking for a challenge consider that real-world CSVs often have missing values. In this context, we don’t want to throw out these rows if we can help it – its equivalent to misplacing an entire TrainCar after all! As an optional and more challenging implementation of loadTrain(), LLL has provided a set of guidelines for how to handle missing data:

  1. If the TrainCar is missing a value for its ‘number of cargo’, ignore the entry as it is “unrecoverable”.
  2. If the TrainCar is missing a passenger class, assume it is the worst possible class (Class = 3)
  3. If the TrainCar is missing a value for its ‘number of passengers’, set the number of passengers equal to the number of cargo unless the number of cargo exceeds the maximum passenger size. (If this is the case, use the maximum passenger size as the number instead)

Each class of TrainCar (first class, second class, or third class) has its own unique passenger maximum:

  • First class trains can hold at most 50 passengers
  • Second class trains can hold at most 100 passengers
  • Third class trains can hold at most 150 passengers

Train.merge()

None Train.merge(Train otherT, ival)
# Input:
# otherT, a *different* instance of a Train
# ival, an integer representing which value to sort by
# Output: None
# merge should take the contents of self (the current Train) and merge it
# with the contents of otherT (another Train object).
# You may assume both Trains are sorted in ascending order based on the value
# of ival (0 = class, 1 = passenger number, 2 = cargo number)

Given two Trains (self and otherT), write a function that merges the contents of both into the current Train (self). The logic behind this should work as follows:

  1. Create a third Train (or dummy TrainCar) that will store the merged list (either as head or as dummy.next)
  2. Compare the current value of self and otherT and insert the smaller of the two values into the merged list.
  3. Increment whichever Train you just inserted so that current points to the next value in the list.
  4. Repeat 2-3 until one Train is completely inserted (and the current value is None)
  5. Append the rest of the remaining Train to the end of the merged list.
  6. Set the value of self.head equal to the first merged value. (If dummy TrainCar this should be the second value!)

If you’ve successfully completed this algorithm – congratulations you have written the more difficult part of mergeSort(), which we will cover later in the course.