MSE 598-DM Spring 2021

Much of the course material is in the GitHub repository (johnsonht/bowtie_tutorial).

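A snippet to clone the repository and list the data files (the leading ! runs a shell command from a notebook cell, for example in Colab):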

!git clone https://github.com/johnsonht/bowtie_tutorial.git 
import glob
for f in glob.glob("bowtie_tutorial/Wafer_Data_Files/*"):
  print(f)

Day 1

An example .ipynb notebook is available in the repository.

Third party modules: numpy, matplotlib, glob2

Supplementary images:

images/5x_images: IR Transmission, Shear0 and Shear45 .png images of the dataset
images/hot_pixel_removal: Before-and-after images for removing hypersensitive (hot) pixels
images/subtraction_images: Example subtraction images, with before-and-after images

Order of scripts:

  1. reading_the_data.py

    Read, parse, and recreate images from a .dt1 file. Plot and save 25 Light, Shear0, and Shear45 images extracted from the .dt1 files.

  2. inspect_the_data.py

    Use a multiset to identify pixels that frequently have a large value. Replace these pixels with the mean of their neighbors.

  3. apply_subtraction_image.py

    Create and apply a subtraction image.
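Steps 2 and 3 can be sketched on synthetic data as follows. This is a minimal illustration, not the contents of inspect_the_data.py or apply_subtraction_image.py: the function names, the threshold and min_fraction parameters, and the choice of a per-pixel median as the subtraction image are all assumptions made for the example.

```python
from collections import Counter

import numpy as np


def find_hot_pixels(images, threshold, min_fraction=0.5):
    """Return (row, col) positions that exceed `threshold` in at least
    `min_fraction` of the images (hypothetical helper)."""
    counts = Counter()  # multiset of pixel positions
    for img in images:
        rows, cols = np.nonzero(img > threshold)
        counts.update(zip(rows.tolist(), cols.tolist()))
    min_count = min_fraction * len(images)
    return [pos for pos, n in counts.items() if n >= min_count]


def replace_with_neighbor_mean(img, hot_pixels):
    """Return a float copy of `img` in which each hot pixel is replaced by
    the mean of its (up to 8) neighbors that are not hot themselves."""
    cleaned = img.astype(float)  # astype returns a copy
    hot = set(hot_pixels)
    n_rows, n_cols = img.shape
    for r, c in hot:
        neighbors = [
            img[rr, cc]
            for rr in range(max(r - 1, 0), min(r + 2, n_rows))
            for cc in range(max(c - 1, 0), min(c + 2, n_cols))
            if (rr, cc) != (r, c) and (rr, cc) not in hot
        ]
        if neighbors:
            cleaned[r, c] = np.mean(neighbors)
    return cleaned


def make_subtraction_image(images):
    """Assumed definition: per-pixel median over a stack of images."""
    return np.median(np.stack(images), axis=0)


# Synthetic stand-in for a stack of images of one type (not real wafer data).
rng = np.random.default_rng(0)
background = np.linspace(0.0, 1.0, 64).reshape(8, 8)
stack = [background + rng.normal(0.0, 0.01, size=(8, 8)) for _ in range(5)]
for img in stack:
    img[3, 4] = 50.0       # the same hypersensitive pixel in every image
stack[0][6, 1] += 5.0      # a feature present in only one image

hot = find_hot_pixels(stack, threshold=10.0)
cleaned = [replace_with_neighbor_mean(img, hot) for img in stack]
subtraction = make_subtraction_image(cleaned)
corrected = cleaned[0] - subtraction  # shared background is removed
```

In this toy case the pixel that is bright in every image is flagged and smoothed over, while the feature that appears in only one image survives the subtraction. With real data, the threshold and the set of images used to build the subtraction image would need to be chosen by inspection.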

Assignment:

Find a Shear0 or Shear45 image in which the bowties show more clearly after it has been cleaned by removing hot pixels and corrected with a subtraction image. Turn in the “before” and “after” .png images.

Note: if it’s easier, you may work (in Colab, for example) from the Python notebook bowtie_tutorial_day_1.ipynb, which contains the same information as the Python scripts listed above.

Day 2

Today, again, get all of the data you need from the johnsonht/bowtie_tutorial GitHub repo. Then run bowtie_tutorial_day2.ipynb in Colab. Fix paths as needed. (Sorry!)

Part 1: Annotate Some Images

Here we load a clean image, divide it into an NxN grid, and highlight the largest-valued pixel in each grid cell (a place where we might find a bowtie). Then someone has to look at each candidate bowtie and classify it for us: bowtie or non-bowtie?
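The grid step can be sketched as below. This is an illustrative version, not the notebook's code: the function name brightest_pixel_per_cell and the way cell edges are computed are assumptions, and n must not exceed the image dimensions.

```python
import numpy as np


def brightest_pixel_per_cell(img, n):
    """Split a 2-D image into an n x n grid and return the (row, col)
    position of the largest-valued pixel in each cell, in row-major order."""
    n_rows, n_cols = img.shape
    # Integer cell boundaries; cells may differ by one pixel in size.
    row_edges = np.linspace(0, n_rows, n + 1).astype(int)
    col_edges = np.linspace(0, n_cols, n + 1).astype(int)
    candidates = []
    for i in range(n):
        for j in range(n):
            r0, r1 = row_edges[i], row_edges[i + 1]
            c0, c1 = col_edges[j], col_edges[j + 1]
            cell = img[r0:r1, c0:c1]
            # argmax works on the flattened cell; convert back to 2-D.
            dr, dc = np.unravel_index(np.argmax(cell), cell.shape)
            candidates.append((int(r0 + dr), int(c0 + dc)))
    return candidates


# Toy image with one bright pixel in each quadrant.
img = np.zeros((6, 6))
img[1, 2] = 9.0
img[0, 5] = 7.0
img[4, 0] = 3.0
img[5, 5] = 8.0
candidates = brightest_pixel_per_cell(img, n=2)
```

Each returned position is a candidate location to show to the person doing the annotation.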

Part 2: Train an SGD Classifier

Here we feed in the training set prepared in Part 1 and train a classifier. A library like scikit-learn offers several options. For this tutorial, we will use a Stochastic Gradient Descent (SGD) classifier, but our code will also demonstrate a couple of alternatives.
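A minimal sketch of the training step with scikit-learn's SGDClassifier follows. The data here are synthetic: the 9x9 patch size, the diagonal template standing in for the positive class, the feature scaling step, and the hinge loss are assumptions for illustration, not the settings used in the notebook.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
patch_size = 9      # assumed patch size around each candidate pixel
n_per_class = 100

# Stand-in pattern for the positive class (bright along both diagonals).
# Real training patches would come from the annotated images of Part 1.
template = np.eye(patch_size) + np.fliplr(np.eye(patch_size))

labels = np.array([1] * n_per_class + [0] * n_per_class)
noise = rng.normal(0.0, 0.3, size=(2 * n_per_class, patch_size, patch_size))
patches = noise + labels[:, None, None] * template

# Flatten each patch into one feature vector, then shuffle and split.
X = patches.reshape(len(patches), -1)
order = rng.permutation(len(X))
X, y = X[order], labels[order]
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# Scaling the features first generally helps SGD converge.
clf = make_pipeline(
    StandardScaler(),
    SGDClassifier(loss="hinge", random_state=0),
)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # fraction of held-out patches classified correctly
```

Holding out part of the annotated set, as done here, gives a quick check that the classifier has learned something before it is applied to a new image.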

Part 3: Implement the Classifier

Here we load a different image, divide it into an NxN grid, highlight the largest-valued pixel in each grid cell, and then ask the trained classifier whether each location is the site of a bowtie. We then mark the bowties and the non-bowties.
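The classification step might look like the sketch below, which cuts a patch around each candidate pixel and sorts the candidates by predicted label. The helper names, the patch half-width, and the zero padding at the image edges are assumptions. ThresholdStandIn is a hypothetical placeholder used only to keep the example self-contained; in practice the classifier trained in Part 2 would be passed in, since it also exposes a predict method.

```python
import numpy as np


def extract_patch(img, center, half):
    """Square patch of side 2*half + 1 centered on `center`,
    zero-padded where it would extend past the image edge."""
    padded = np.pad(img, half, mode="constant")
    r, c = center
    # After padding, the original pixel (r, c) sits at (r + half, c + half).
    return padded[r:r + 2 * half + 1, c:c + 2 * half + 1]


def classify_candidates(img, candidates, classifier, half=4):
    """Split candidate positions into (bowties, non_bowties) using any
    object with a scikit-learn style predict(X) method."""
    features = np.array(
        [extract_patch(img, rc, half).ravel() for rc in candidates]
    )
    predictions = classifier.predict(features)
    bowties = [rc for rc, p in zip(candidates, predictions) if p == 1]
    non_bowties = [rc for rc, p in zip(candidates, predictions) if p != 1]
    return bowties, non_bowties


class ThresholdStandIn:
    """Hypothetical placeholder classifier: label 1 if the patch mean
    exceeds a cutoff. Not a real bowtie detector."""

    def __init__(self, cutoff):
        self.cutoff = cutoff

    def predict(self, X):
        return (np.asarray(X).mean(axis=1) > self.cutoff).astype(int)


# Toy image: one bright block, one empty region.
img = np.zeros((20, 20))
img[3:8, 3:8] = 1.0
candidates = [(5, 5), (15, 15)]
bowties, non_bowties = classify_candidates(
    img, candidates, ThresholdStandIn(cutoff=0.5), half=2
)
```

The two returned lists can then be drawn over the image in different colors, for example with matplotlib, to mark the bowties and the non-bowties.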