[CS231n] E1Q1: k-Nearest Neighbor Classifier
By Cole Page
Published June 27, 2017

In pattern recognition, the k-nearest neighbors algorithm (kNN) is a non-parametric method used for classification and regression. In this assignment we utilize a very basic implementation to predict image labels in the CIFAR-10 dataset.
The kNN classifier consists of two stages:
- During training, the classifier takes the training data and simply remembers it.
- During testing, kNN classifies every test image by comparing it to all training images and transferring the labels of the k most similar training examples.
- The value of k is cross-validated.
In the following we will implement these steps to better understand the basic image classification pipeline and cross-validation, and gain proficiency in writing efficient, vectorized code.
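The two stages above can be sketched on toy data before diving into the assignment. This is a minimal illustration (hypothetical 1-D "images", not the assignment's code): training just memorizes the data, and prediction is a distance sort plus a majority vote.

```python
import numpy as np
from collections import Counter

# "Training": just memorize the data (toy 1-D examples).
X_train = np.array([[0.0], [1.0], [2.0], [10.0]])
y_train = np.array([0, 0, 0, 1])

def predict(x, k):
    # "Testing": L2 distance to every training point, then a vote
    # among the labels of the k nearest training examples.
    dists = np.sqrt(np.sum((X_train - x) ** 2, axis=1))
    closest = y_train[dists.argsort()[:k]]
    return Counter(closest).most_common(1)[0][0]

print(predict(np.array([1.5]), k=3))  # nearest neighbors are all labeled 0 -> 0
```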
INITIAL SETUP
Q1: k-Nearest Neighbor classifier
In [1]:
# Run some setup code for this notebook.
from __future__ import print_function
import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt

# This is a bit of magic to make matplotlib figures appear inline in the notebook
# rather than in a new window.
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Some more magic so that the notebook will reload external python modules;
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2
In [2]:
# Load the raw CIFAR-10 data.
cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

# As a sanity check, we print out the size of the training and test data.
print('Training data shape: ', X_train.shape)
print('Training labels shape: ', y_train.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)
Training data shape:  (50000, 32, 32, 3)
Training labels shape:  (50000,)
Test data shape:  (10000, 32, 32, 3)
Test labels shape:  (10000,)
In [3]:
# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)
    idxs = np.random.choice(idxs, samples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()
In [4]:
# Subsample the data for more efficient code execution in this exercise
num_training = 5000
mask = list(range(num_training))
X_train = X_train[mask]
y_train = y_train[mask]

num_test = 500
mask = list(range(num_test))
X_test = X_test[mask]
y_test = y_test[mask]
In [5]:
# Reshape the image data into rows
X_train = np.reshape(X_train, (X_train.shape[0], -1))
X_test = np.reshape(X_test, (X_test.shape[0], -1))
print(X_train.shape, X_test.shape)
(5000, 3072) (500, 3072)
In [6]:
from cs231n.classifiers import KNearestNeighbor

# Create a kNN classifier instance.
# Remember that training a kNN classifier is a no-op:
# the classifier simply remembers the data and does no further processing
classifier = KNearestNeighbor()
classifier.train(X_train, y_train)
We would now like to classify the test data with the kNN classifier. Recall that we can break down this process into two steps:
1. First we must compute the distances between all test examples and all train examples.
2. Given these distances, for each test example we find the k nearest examples and have them vote for the label.
Let's begin with computing the distance matrix between all training and test examples. For example, if there are Ntr training examples and Nte test examples, this stage should result in an Nte x Ntr matrix where each element (i,j) is the distance between the ith test and jth train example.
First, open `cs231n/classifiers/k_nearest_neighbor.py` and implement the function `compute_distances_two_loops` that uses a (very inefficient) double loop over all pairs of (test, train) examples and computes the distance matrix one element at a time.
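Before looking at the assignment's version, the double-loop distance matrix can be sketched on tiny toy arrays (illustrative names and data, not the graded code). Each element (i, j) is the L2 distance between the ith test row and the jth train row:

```python
import numpy as np

# Toy data: 4 "train" rows and 2 "test" rows of dimension 3.
X_train = np.arange(12, dtype=float).reshape(4, 3)
X_test = np.arange(6, dtype=float).reshape(2, 3)

num_test, num_train = X_test.shape[0], X_train.shape[0]
dists = np.zeros((num_test, num_train))
for i in range(num_test):
    for j in range(num_train):
        # element (i, j): L2 distance between ith test and jth train row
        dists[i, j] = np.sqrt(np.sum((X_test[i] - X_train[j]) ** 2))

print(dists.shape)  # (2, 4) -> Nte x Ntr
```

Here X_test[0] happens to equal X_train[0], so dists[0, 0] is 0.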
In [7]:
# Open cs231n/classifiers/k_nearest_neighbor.py and implement
# compute_distances_two_loops.

# Test your implementation:
dists = classifier.compute_distances_two_loops(X_test)
print("Shape: " + str(dists.shape))
Computing Distance: 0
Computing Distance: 25
...
Computing Distance: 475
Shape: (500, 5000)
In [23]:
# We can visualize the distance matrix: each row is a single test example and
# its distances to training examples
plt.imshow(dists, interpolation='none')
plt.show()
INLINE QUESTION #1:
1. What in the data is the cause behind the distinctly bright rows?
Each row represents a test image, and every column in that row is its distance to one of the 5000 training images. If the pixel at row i, column j is white, the distance is high, meaning little similarity between that test and training image. Thus a distinctly bright row means that test image has very few similar images in the entire training set.
2. What causes the columns?
Similar to the explanation above, the bright columns represent training images that are similar to very few of the test images, as most of the 500 test rows show large (bright) L2 distances to them.
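This reading of the distance matrix can be made concrete: the "brightest" row or column is simply the one with the largest mean distance. A small sketch on synthetic data (illustrative only, not the CIFAR-10 matrix):

```python
import numpy as np

# Synthetic 5x8 distance matrix with one planted outlier row and column.
rng = np.random.default_rng(0)
dists = rng.random((5, 8))
dists[3, :] += 10.0  # test example 3 is far from all training examples
dists[:, 6] += 5.0   # train example 6 is far from all test examples

brightest_row = dists.mean(axis=1).argmax()  # least-matched test example
brightest_col = dists.mean(axis=0).argmax()  # least-matched train example
print(brightest_row, brightest_col)  # 3 6
```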
In [42]:
# Now implement the function predict_labels and run the code below:
# We use k = 1 (which is Nearest Neighbor).
y_test_pred = classifier.predict_labels(dists, k=1)

# Compute and print the fraction of correctly predicted examples
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
Got 137 / 500 correct => accuracy: 0.274000
In [109]:
y_test_pred = classifier.predict_labels(dists, k=5)
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
Got 145 / 500 correct => accuracy: 0.290000
In [46]:
# Now let's speed up distance matrix computation by using partial vectorization
# with one loop. Implement the function compute_distances_one_loop and run the
# code below:
dists_one = classifier.compute_distances_one_loop(X_test)
Computing Distance: 0
Computing Distance: 25
...
Computing Distance: 475
In [47]:
# To ensure that our vectorized implementation is correct, we make sure that it
# agrees with the naive implementation. There are many ways to decide whether
# two matrices are similar; one of the simplest is the Frobenius norm. In case
# you haven't seen it before, the Frobenius norm of two matrices is the square
# root of the squared sum of differences of all elements; in other words, reshape
# the matrices into vectors and compute the Euclidean distance between them.
difference = np.linalg.norm(dists - dists_one, ord='fro')
print('Difference was: %f' % (difference, ))
if difference < 0.001:
    print('Good! The distance matrices are the same')
else:
    print('Uh-oh! The distance matrices are different')
Difference was: 0.000000
Good! The distance matrices are the same
In [59]:
# Now implement the fully vectorized version inside compute_distances_no_loops
# and run the code
dists_two = classifier.compute_distances_no_loops(X_test)

# check that the distance matrix agrees with the one we computed before:
difference = np.linalg.norm(dists - dists_two, ord='fro')
print('Difference was: %f' % (difference, ))
if difference < 0.001:
    print('Good! The distance matrices are the same')
else:
    print('Uh-oh! The distance matrices are different')
Difference was: 0.000000
Good! The distance matrices are the same
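The fully vectorized version rests on the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y, which lets the whole matrix come out of one matrix multiply and two row-wise sums. A small self-contained sketch (toy data, illustrative names) that verifies the trick against an explicit computation:

```python
import numpy as np

rng = np.random.default_rng(1)
X_test = rng.random((3, 4))   # 3 "test" rows
X_train = rng.random((5, 4))  # 5 "train" rows

# ||x||^2 per test row, ||y||^2 per train row, and all cross terms x.y
test_sq = np.sum(X_test ** 2, axis=1).reshape(-1, 1)  # (3, 1), broadcasts
train_sq = np.sum(X_train ** 2, axis=1)               # (5,)
cross = X_test @ X_train.T                            # (3, 5)
# clamp tiny negative round-off before the square root
dists_fast = np.sqrt(np.maximum(test_sq + train_sq - 2 * cross, 0))

# Reference: direct pairwise differences via broadcasting
dists_slow = np.sqrt(((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(-1))
print(np.allclose(dists_fast, dists_slow))  # True
```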
In [53]:
# Let's compare how fast the implementations are
def time_function(f, *args):
    """
    Call a function f with args and return the time (in seconds) that it took to execute.
    """
    import time
    tic = time.time()
    f(*args)
    toc = time.time()
    return toc - tic

two_loop_time = time_function(classifier.compute_distances_two_loops, X_test)
print('Two loop version took %f seconds' % two_loop_time)

one_loop_time = time_function(classifier.compute_distances_one_loop, X_test)
print('One loop version took %f seconds' % one_loop_time)

no_loop_time = time_function(classifier.compute_distances_no_loops, X_test)
print('No loop version took %f seconds' % no_loop_time)

# you should see significantly faster performance with the fully vectorized implementation
FINISHED WITH TEST IMAGE: 0
...
FINISHED WITH TEST IMAGE: 450
Two loop version took 32.644345 seconds
FINISHED WITH TEST IMAGE: 0
...
FINISHED WITH TEST IMAGE: 450
One loop version took 75.630863 seconds
No loop version took 0.413558 seconds
CROSS-VALIDATION
We have implemented the k-Nearest Neighbor classifier, but we set the value k = 5 arbitrarily. We will now determine the best value of this hyperparameter with cross-validation.
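The fold bookkeeping underneath is simple: np.array_split makes the folds, and each iteration holds one fold out while concatenating the rest. A minimal sketch on toy data (illustrative names; the standard hold-one-fold-out form):

```python
import numpy as np

# 10 toy examples, split into 5 folds of 2 rows each
X = np.arange(10).reshape(10, 1)
num_folds = 5
folds = np.array_split(X, num_folds)
print(len(folds), folds[0].shape)  # 5 (2, 1)

i = 2  # hold out fold 2, train on the other four folds
X_val = folds[i]
X_tr = np.concatenate([f for j, f in enumerate(folds) if j != i])
print(X_tr.shape, X_val.shape)  # (8, 1) (2, 1)
```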
In [142]:
num_folds = 5
k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]

# use numpy's array_split function to split the training data into num_folds folds
X_train_folds = np.array_split(X_train, num_folds)
y_train_folds = np.array_split(y_train, num_folds)

# A dictionary holding the accuracies for different values of k that we find
# when running cross-validation. After running cross-validation,
# k_to_accuracies[k] should be a list of length num_folds giving the different
# accuracy values that we found when using that value of k.
k_to_accuracies = {}
for k in k_choices:
    k_to_accuracies[k] = []
    # Run the kNN algorithm num_folds times
    for i in range(num_folds):
        X_train = []
        y_train = []
        for j in range(num_folds):
            if i != j:
                X_train.extend(X_train_folds[j])
                y_train.extend(y_train_folds[j])
        X_train = np.array(X_train)
        y_train = np.array(y_train)

        classifier = KNearestNeighbor()
        classifier.train(X_train, y_train)
        dists = classifier.compute_distances_no_loops(X_test)
        y_test_pred = classifier.predict_labels(dists, k=k)
        num_correct = np.sum(y_test_pred == y_test)
        accuracy = float(num_correct) / num_test
        k_to_accuracies[k].append(accuracy)

# Print out the computed accuracies
for k in sorted(k_to_accuracies):
    for accuracy in k_to_accuracies[k]:
        print('k = %d, accuracy = %f' % (k, accuracy))
k = 1, accuracy = 0.258000
k = 1, accuracy = 0.276000
k = 1, accuracy = 0.260000
k = 1, accuracy = 0.250000
k = 1, accuracy = 0.254000
k = 3, accuracy = 0.276000
k = 3, accuracy = 0.280000
k = 3, accuracy = 0.262000
k = 3, accuracy = 0.272000
k = 3, accuracy = 0.252000
k = 5, accuracy = 0.284000
k = 5, accuracy = 0.294000
k = 5, accuracy = 0.272000
k = 5, accuracy = 0.268000
k = 5, accuracy = 0.280000
k = 8, accuracy = 0.280000
k = 8, accuracy = 0.282000
k = 8, accuracy = 0.282000
k = 8, accuracy = 0.250000
k = 8, accuracy = 0.290000
k = 10, accuracy = 0.274000
k = 10, accuracy = 0.286000
k = 10, accuracy = 0.278000
k = 10, accuracy = 0.260000
k = 10, accuracy = 0.270000
k = 12, accuracy = 0.282000
k = 12, accuracy = 0.266000
k = 12, accuracy = 0.272000
k = 12, accuracy = 0.276000
k = 12, accuracy = 0.280000
k = 15, accuracy = 0.278000
k = 15, accuracy = 0.270000
k = 15, accuracy = 0.250000
k = 15, accuracy = 0.262000
k = 15, accuracy = 0.270000
k = 20, accuracy = 0.274000
k = 20, accuracy = 0.254000
k = 20, accuracy = 0.242000
k = 20, accuracy = 0.258000
k = 20, accuracy = 0.274000
k = 50, accuracy = 0.240000
k = 50, accuracy = 0.234000
k = 50, accuracy = 0.234000
k = 50, accuracy = 0.246000
k = 50, accuracy = 0.234000
k = 100, accuracy = 0.230000
k = 100, accuracy = 0.218000
k = 100, accuracy = 0.224000
k = 100, accuracy = 0.224000
k = 100, accuracy = 0.224000
In [143]:
# plot the raw observations
for k in k_choices:
    accuracies = k_to_accuracies[k]
    plt.scatter([k] * len(accuracies), accuracies)

# plot the trend line with error bars that correspond to standard deviation
accuracies_mean = np.array([np.mean(v) for k, v in sorted(k_to_accuracies.items())])
accuracies_std = np.array([np.std(v) for k, v in sorted(k_to_accuracies.items())])
plt.errorbar(k_choices, accuracies_mean, yerr=accuracies_std)
plt.title('Cross-validation on k')
plt.xlabel('k')
plt.ylabel('Cross-validation accuracy')
plt.show()
In [152]:
# Based on the cross-validation results above, choose the best value for k,
# retrain the classifier using all the training data, and test it on the test
# data. You should be able to get above 28% accuracy on the test data.
best_k = 8

classifier = KNearestNeighbor()
classifier.train(X_train, y_train)
y_test_pred = classifier.predict(X_test, k=best_k)

# Compute and display the accuracy
num_correct = np.sum(y_test_pred == y_test)
accuracy = float(num_correct) / num_test
print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
Got 145 / 500 correct => accuracy: 0.290000
K-NEAREST NEIGHBOR CLASS
The following code contains the kNN class and its member functions that I use in the above analysis.
In [ ]:
import numpy as np
from past.builtins import xrange
from collections import Counter

class KNearestNeighbor(object):
    """ a kNN classifier with L2 distance """

    def __init__(self):
        pass

    def train(self, X, y):
        """
        Train the classifier. For k-nearest neighbors this is just
        memorizing the training data.

        Inputs:
        - X: A numpy array of shape (num_train, D) containing the training data
          consisting of num_train samples each of dimension D.
        - y: A numpy array of shape (N,) containing the training labels, where
          y[i] is the label for X[i].
        """
        self.X_train = X
        self.y_train = y

    def predict(self, X, k=1, num_loops=0):
        """
        Predict labels for test data using this classifier.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data consisting
          of num_test samples each of dimension D.
        - k: The number of nearest neighbors that vote for the predicted labels.
        - num_loops: Determines which implementation to use to compute distances
          between training points and testing points.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        if num_loops == 0:
            dists = self.compute_distances_no_loops(X)
        elif num_loops == 1:
            dists = self.compute_distances_one_loop(X)
        elif num_loops == 2:
            dists = self.compute_distances_two_loops(X)
        else:
            raise ValueError('Invalid value %d for num_loops' % num_loops)
        return self.predict_labels(dists, k=k)

    def compute_distances_two_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a nested loop over both the training data and the
        test data.

        Inputs:
        - X: A numpy array of shape (num_test, D) containing test data.

        Returns:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          is the Euclidean distance between the ith test point and the jth training
          point.
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in xrange(num_test):
            if i % 50 == 0:
                print("FINISHED WITH TEST IMAGE: ", i)
            for j in xrange(num_train):
                dists[i, j] = np.sqrt(np.sum((X[i, :] - self.X_train[j, :]) ** 2))
        return dists

    def compute_distances_one_loop(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using a single loop over the test data.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in xrange(num_test):
            if i % 50 == 0:
                print("FINISHED WITH TEST IMAGE: ", i)
            dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))
        return dists

    def compute_distances_no_loops(self, X):
        """
        Compute the distance between each test point in X and each training point
        in self.X_train using no explicit loops.

        Input / Output: Same as compute_distances_two_loops
        """
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        # (x-y)^2 = x^2 + y^2 - 2xy -> test_sum + train_sum - 2*inner_product
        test_sum = np.sum(np.square(X), axis=1)              # shape -> (500,)
        train_sum = np.sum(np.square(self.X_train), axis=1)  # shape -> (5000,)
        inner_product = np.dot(X, self.X_train.T)            # shape -> (500, 5000)
        # reshape test_sum from (500,) to (500, 1) while keeping the same data;
        # the -1 infers the same shape as before (500)
        dists = np.sqrt(test_sum.reshape(-1, 1) + train_sum - 2 * inner_product)
        return dists

    def predict_labels(self, dists, k=1):
        """
        Given a matrix of distances between test points and training points,
        predict a label for each test point.

        Inputs:
        - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
          gives the distance between the ith test point and the jth training point.

        Returns:
        - y: A numpy array of shape (num_test,) containing predicted labels for the
          test data, where y[i] is the predicted label for the test point X[i].
        """
        num_test = dists.shape[0]
        y_pred = np.zeros(num_test)
        for i in xrange(num_test):
            # While looping over test images, get the distances of image i to
            # every training image as a new numpy array "dists_i"
            dists_i = dists[i]
            # dists_i.argsort() gives the indices of sorted distances, low to high;
            # dists_i.argsort()[:k] gives the k lowest-distance indices (the k
            # nearest neighbors), and y_train[...] gives the labels of those
            # training images. closest_y is of len == k.
            closest_y = self.y_train[dists_i.argsort()[:k]]
            # choose the most common label in closest_y
            # (ties broken with lowest label... Counter does this...)
            y_pred[i] = Counter(closest_y).most_common(1)[0][0]
        return y_pred
Stanford University's CS231n: Convolutional Neural Networks for Visual Recognition
Notes, Thoughts, Assignments, and Progress
By Cole Page
Published June 22, 2017

Feeling a little rusty on my basic deep learning skills, I have decided to spend the next few weeks working through Stanford University's class, CS231n: Convolutional Neural Networks for Visual Recognition. Much of the material, especially the raw Python, is simply a refresher for me, but a lot of the application is new, as I have not worked extensively on applying machine learning to imaging problems. As I work through the class I will be putting my thoughts, notes, assignments, and IPython Notebooks up here as a way to keep myself honest and motivated to progress through the class in its entirety.
Embedding Jupyter Notebooks
As discussed in my post on underlying site structure, I have opted to build out every piece of this site from scratch instead of working with static-page generators such as Pelican or Hyde. I discuss the logic behind this choice there, so I won't reiterate, but due to this choice I need to set up processes for displaying work moving forward. Jupyter Notebooks will be the most important and most frequent format, so I want to explain how I've decided to go about this.
By Cole Page
Published May 20, 2017
JaVale By the Numbers
By Cole Page
Published May 3, 2017
Just prior to the start of the 2016-2017 NBA season, when the big-man-depleted Golden State Warriors signed JaVale McGee to a one-year minimum contract, the critics quickly surfaced. Articles bashing JaVale were numerous, and praise nonexistent. An article on Complex ranked JaVale as the 12th-worst player in the NBA and wrote: “The Golden State Warriors need help defending the paint so they’d figure, ‘Hey, JaVale is a giant human with a heartbeat, let’s sign him!’ Problem is he really isn’t that good. Maybe the organization of the future will somehow turn JaVale into an elite big man, or maybe they’re just desperate for some rim protection.”
Well, seven months later and four wins into the 2017 playoffs, it looks like the Golden State organization may have done exactly that. Not only did JaVale produce important big-man minutes off the bench all season, he started ten games in place of the injured Zaza Pachulia and put together one of the most impressive per-minute statistical seasons for a center in recent memory. JaVale only averaged 9.6 minutes a game, and much of his statistical dominance may be attributed to the strength of the league-leading team surrounding him, but anyone who actually watched this season got a chance to see a whole new JaVale: a powerfully athletic, rim-destroying, ball-swatting monster having a career year.
Looking at his raw averages of 6.1 points, 3.2 rebounds, and 0.9 blocks in only 9.6 minutes per game won’t blow you away, but his per-36-minute stats are absurd. In this article we take a look at just how impressively a starting, full-minutes JaVale would stack up against the rest of the league. I’m not suggesting that increasing his playing time would affect his stats linearly like this, and with Golden State’s depth and small-ball play we most likely aren’t going to find out any time soon, but the Warriors are without a doubt playing some of their best basketball when he’s on the floor, and so far JaVale has silenced most if not all of his critics; looking at you, Shaq.
Shooting
Dunks
Of all the stats I am going to talk about today, this one should be the least surprising and is without a doubt the most dominant. Through all his blunders and mistakes over the years, JaVale has never failed to throw down with authority. At 7’ tall and sporting an 8’ wingspan, there aren’t many people on the planet who have that kind of range around the rim. Hometown bias and NBA sponsorship (KIA) aside, JaVale probably should have won the 2011 Slam Dunk Contest, and he annually puts up highlight-reel dunks. With a fraction of the minutes played, JaVale still ranked 12th in total dunks this season. Adjusted, JaVale is averaging 5.89 dunks/36min, a whopping 2.35 more than this season’s dunk leader DeAndre Jordan. The Clippers might still hold the title of Lob City, but nobody has dominated the alley-oop this season like Draymond, Andre, and Steph chucking up bombs to JaVale McGee.
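The per-36 adjustment used throughout this article is simple arithmetic: scale a season total by the minutes a full-time starter would play. A quick sketch using the totals from the dunk table (the helper name `per_36` is mine):

```python
# per-36 stat = season total / minutes played * 36
def per_36(total, minutes):
    return round(total / minutes * 36, 2)

print(per_36(121, 739))   # JaVale McGee: 121 dunks in 739 min -> 5.89 per 36
print(per_36(253, 2570))  # DeAndre Jordan: 253 dunks in 2570 min -> 3.54 per 36
```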
Dunks Per 36 Min

RANK  PLAYER                 TEAM  MINUTES  DUNKS  DUNKS/36MIN
 1    JaVale McGee           GSW    739     121    5.89
 2    Clint Capela           HOU   1551     163    3.78
 3    DeAndre Jordan         LAC   2570     253    3.54
 4    Montrezl Harrell       HOU   1064      98    3.32
 5    Dwight Howard          ATL   2199     199    3.26
 6    Rudy Gobert            UTA   2744     235    3.08
 7    Richaun Holmes         PHI   1193      92    2.78
 8    Giannis Antetokounmpo  MIL   2845     194    2.45
 9    Hassan Whiteside       MIA   2513     163    2.34
10    Mason Plumlee          DEN   2147     132    2.21
11    Marquese Chriss        PHX   1743     103    2.13
12    Jabari Parker          MIL   1728      92    1.92
13    Tristan Thompson       CLE   2336     122    1.88
14    LeBron James           CLE   2794     145    1.87
15    Kevin Durant           GSW   2070     107    1.86
16    Anthony Davis          NOP   2708     135    1.79
17    Andre Drummond         DET   2409     118    1.76
18    Steven Adams           OKC   2389     112    1.69
19    Aaron Gordon           ORL   2298      99    1.55
20    Karl-Anthony Towns     MIN   3030     130    1.54

Still unconvinced? Check out all 121 of JaVale's dunks below.
Points in the Paint
When your per-36 adjusted dunks are this far ahead of the competition, it’s not too much of a surprise that his adjusted points in the paint reflect the same trend. Playing on a team that dished it at historic numbers this year and constantly looks for the third pass doesn’t hurt either. Add that to the absurd gravitational pull of Golden State’s guards on the 3-point line and you have yourself a big-man PITP feeding frenzy on a nightly basis, something JaVale has happily taken advantage of this season. JaVale’s 20.00 PITP per-36 is the highest since Shaquille O’Neal’s 20.00 in 2001-2002. In fact, in the past 20 years no other player has even eclipsed 15.5 PITP per-36, Shaq pulling off the feat for a dominant 11 straight years between 1996 and 2007.
Points in the Paint Per 36 Min

RANK  PLAYER                 TEAM  PITP  MIN   PITP/36MIN
 1    JaVale McGee           GSW    410   739  20.00
 2    Clint Capela           HOU    720  1551  16.71
 3    Enes Kanter            OKC    704  1533  16.53
 4    Hassan Whiteside       MIA    970  2513  13.90
 5    Karl-Anthony Towns     MIN   1154  3030  13.71
 6    LeBron James           CLE   1032  2794  13.30
 7    Giannis Antetokounmpo  MIL   1044  2845  13.21
 8    Andre Drummond         DET    882  2409  13.18
 9    Anthony Davis          NOP    970  2708  12.90
10    Nikola Jokic           DEN    730  2038  12.89
11    DeMarcus Cousins       NOP    848  2465  12.38
12    Dwight Howard          ATL    748  2199  12.25
13    Brook Lopez            BKN    732  2222  11.86
14    DeAndre Jordan         LAC    824  2570  11.54
15    Steven Adams           OKC    742  2389  11.18
16    Rudy Gobert            UTA    822  2744  10.78
17    Russell Westbrook      OKC    816  2802  10.48
18    DeMar DeRozan          TOR    760  2620  10.44
19    Isaiah Thomas          BOS    726  2569  10.17
20    John Wall              WAS    774  2835   9.83

Offensive Rating
Individual offensive rating is a tough stat to separate from the play of the team as a whole, but ranking #1 even when the rest of your team is in the top 10 still tells an impressive story. Not only does JaVale fit into the offensive juggernaut that is the Golden State Warriors, he makes them the most potent version of themselves when he’s on the floor.
5-Man Lineup         OFFRTG  DEFRTG  NETRTG
GSW Big 4 + JaVale   124.4   92.2    32.1

The Warriors lineup with McGee/Curry/Thompson/Durant/Green has the highest net rating of any five-man combo in the NBA this season with a minimum of 100 minutes played.
Offensive Rating

RANK  PLAYER           TEAM  OFFRTG
 1    JaVale McGee     GSW   121.4
 2    Stephen Curry    GSW   118.1
 3    Pierre Jackson   DAL   117.9
 4    Kevin Durant     GSW   117.2
 5    Chris Paul       LAC   116.2
 6    Zaza Pachulia    GSW   115.8
 7    Klay Thompson    GSW   115.6
 8    Draymond Green   GSW   115.2
 9    Blake Griffin    LAC   115.2
10    Gary Harris      DEN   115.0
11    Nikola Jokic     DEN   114.9
12    LeBron James     CLE   114.9
13    JJ Redick        LAC   114.6
14    Jordan Farmar    SAC   114.5
15    Andre Iguodala   GSW   114.3
16    Kyrie Irving     CLE   114.2
17    Ryan Anderson    HOU   113.8
18    Clint Capela     HOU   113.7
19    Isaiah Thomas    BOS   113.6
20    James Harden     HOU   113.6

Plus/Minus
Plus/Minus further illustrates how the already outstanding Warriors are even better when JaVale is on the floor. It’s not surprising that the team with the NBA’s best record, the #1 offensive efficiency, and the #2 defensive efficiency would dominate the plus/minus category.
Plus/Minus Per 36 Min

RANK  PLAYER            TEAM  +/-   MIN   +/- / 36MIN
 1    JaVale McGee      GSW    312   739  15.20
 2    Stephen Curry     GSW   1015  2638  13.85
 3    Kevin Durant      GSW    711  2070  12.37
 4    Draymond Green    GSW    820  2471  11.95
 5    Zaza Pachulia     GSW    418  1268  11.87
 6    Klay Thompson     GSW    801  2649  10.89
 7    Chris Paul        LAC    577  1921  10.81
 8    Andre Iguodala    GSW    527  1998   9.50
 9    Patty Mills       SAS    410  1754   8.42
10    JJ Redick         LAC    470  2198   7.70
11    Blake Griffin     LAC    440  2076   7.63
12    Ryan Anderson     HOU    407  2116   6.92
13    DeAndre Jordan    LAC    459  2570   6.43
14    Kawhi Leonard     SAS    436  2474   6.34
15    LeBron James      CLE    483  2794   6.22
16    Patrick Beverley  HOU    353  2058   6.17
17    Kyle Lowry        TOR    358  2244   5.74
18    Rudy Gobert       UTA    436  2744   5.72
19    Jae Crowder       BOS    349  2335   5.38
20    James Harden      HOU    425  2947   5.19

Second Chance Points
2nd Chance Points Per 36 Min

RANK  PLAYER              TEAM  MIN   2ND PTS  2ND PTS/36MIN
 1    Enes Kanter         OKC   1533  260      6.11
 2    Hassan Whiteside    MIA   2513  375      5.37
 3    Andre Drummond      DET   2409  355      5.31
 4    Dwight Howard       ATL   2199  315      5.16
 5    Zach Randolph       MEM   1786  255      5.14
 6    JaVale McGee        GSW    739   96      4.68
 7    Karl-Anthony Towns  MIN   3030  386      4.59
 8    Nikola Jokic        DEN   2038  246      4.35
 9    Jonas Valanciunas   TOR   2066  240      4.18
10    Rudy Gobert         UTA   2744  301      3.95
11    LaMarcus Aldridge   SAS   2335  247      3.81
12    Russell Westbrook   OKC   2802  292      3.75
13    Anthony Davis       NOP   2708  282      3.75
14    DeAndre Jordan      LAC   2570  266      3.73
15    Kevin Love          CLE   1885  195      3.72
16    Robin Lopez         CHI   2271  225      3.57
17    DeMarcus Cousins    NOP   2465  225      3.29
18    Steven Adams        OKC   2389  210      3.17
19    Carmelo Anthony     NYK   2538  210      2.98
20    Jimmy Butler        CHI   2809  198      2.54

Defense & Shot Blocking
Blocks
JaVale’s class-A airspace around the hoop doesn’t just exist on the offensive end. With premier ball stoppers like Draymond Green, Klay Thompson, and Andre Iguodala hounding the opposing offense, JaVale consistently waits a step away to thunderously deny all shot attempts. With a 2017 block highlight reel almost as long as his dunk tape, JaVale has been nothing but impregnable around the rim, a defensive factor Golden State thought it would be sorely lacking with the departures of Andrew Bogut and Festus Ezeli. In fact, in the preseason this was the chink in the armor that many thought could fell the reigning Western Conference champions. JaVale has done more than his part to change that tune.
| Rank | Player | Team | Min | Blocks | Blocks / 36 Min |
|------|--------|------|-----|--------|-----------------|
| 1 | JaVale McGee | GSW | 739 | 67 | 3.26 |
| 2 | Kyle O'Quinn | NYK | 1229 | 104 | 3.05 |
| 3 | Rudy Gobert | UTA | 2744 | 214 | 2.81 |
| 4 | Myles Turner | IND | 2541 | 172 | 2.44 |
| 5 | Hassan Whiteside | MIA | 2513 | 161 | 2.31 |
| 6 | Alex Len | PHX | 1560 | 98 | 2.26 |
| 7 | Anthony Davis | NOP | 2708 | 167 | 2.22 |
| 8 | Kristaps Porzingis | NYK | 2164 | 129 | 2.15 |
| 9 | Brook Lopez | BKN | 2222 | 124 | 2.01 |
| 10 | Giannis Antetokounmpo | MIL | 2845 | 151 | 1.91 |
| 11 | DeAndre Jordan | LAC | 2570 | 134 | 1.88 |
| 12 | Robin Lopez | CHI | 2271 | 117 | 1.85 |
| 13 | Serge Ibaka | TOR | 2422 | 124 | 1.84 |
| 14 | Kevin Durant | GSW | 2070 | 99 | 1.72 |
| 15 | Draymond Green | GSW | 2471 | 106 | 1.54 |
| 16 | Mason Plumlee | DEN | 2147 | 92 | 1.54 |
| 17 | Dwight Howard | ATL | 2199 | 92 | 1.51 |
| 18 | Marc Gasol | MEM | 2531 | 99 | 1.41 |
| 19 | DeMarcus Cousins | NOP | 2465 | 93 | 1.36 |
| 20 | Gorgui Dieng | MIN | 2653 | 95 | 1.29 |

Blocks Per 36 Min

Watching segments like this one, it's not hard to see how JaVale's impact on both sides of the ball, coupled with Golden State's long-ball wizardry, leads to league-leading +/- statistics.
Defensive Win Shares
Win Shares is a player statistic that attempts to divvy up credit for team success among the individuals on the team.
| Rank | Player | Team | Min | Def WS | Def WS / 36 Min |
|------|--------|------|-----|--------|-----------------|
| 1 | Draymond Green | GSW | 2471 | 4.7 | 0.0685 |
| 2 | Patty Mills | SAS | 1754 | 3.3 | 0.0677 |
| 3 | Stephen Curry | GSW | 2638 | 4.6 | 0.0628 |
| 4 | Rudy Gobert | UTA | 2744 | 4.7 | 0.0617 |
| 5 | Andre Iguodala | GSW | 1998 | 3.4 | 0.0613 |
| 6 | Klay Thompson | GSW | 2649 | 4.5 | 0.0612 |
| 7 | Kevin Durant | GSW | 2070 | 3.5 | 0.0609 |
| 8 | James Johnson | MIA | 2085 | 3.2 | 0.0553 |
| 9 | Victor Oladipo | OKC | 2222 | 3.4 | 0.0551 |
| 10 | Gordon Hayward | UTA | 2516 | 3.8 | 0.0544 |
| 11 | JaVale McGee | GSW | 739 | 1.1 | 0.0536 |
| 12 | Anthony Davis | NOP | 2708 | 4.0 | 0.0532 |
| 13 | Jrue Holiday | NOP | 2190 | 3.2 | 0.0526 |
| 14 | Paul Millsap | ATL | 2343 | 3.4 | 0.0522 |
| 15 | LaMarcus Aldridge | SAS | 2335 | 3.3 | 0.0509 |
| 16 | Solomon Hill | NOP | 2374 | 3.3 | 0.0500 |
| 17 | Andre Roberson | OKC | 2376 | 3.3 | 0.0500 |
| 18 | Jimmy Butler | CHI | 2809 | 3.9 | 0.0500 |
| 19 | DeAndre Jordan | LAC | 2570 | 3.5 | 0.0490 |
| 20 | Kawhi Leonard | SAS | 2474 | 3.2 | 0.0466 |

Defensive Win Shares Per 36 Min

Rebounding
JaVale’s defensive rebound statistics are impressive for a team that defaults its rebounds to pace-pushing forwards like Green, Iguodala, and Durant; this is also why we don’t see Houston or Oklahoma City bigs on this list. What really stands out is his adjusted OREB per 36, which would sit at #2 in the league at 4.87 if he played starter’s minutes. Many of the statistics discussed above stem from offensive rebounding. JaVale doesn’t come into the game expecting to be integral to the Golden State offense aside from lobs to the rim; instead, he takes advantage of the top perimeter-shooting offense in the league and the way it pulls opposing defenders to the 3-point line. The Warriors hit quite a lot of their long balls, but when they don’t, JaVale has been there to pull down boards and put them back in with authority. This consistent combination of dominant physical athleticism and mental awareness is a JaVale the league hasn’t seen before.
| Rank | Player | Team | Min | REB / 36 Min | OREB / 36 Min | DREB / 36 Min |
|------|--------|------|-----|--------------|---------------|---------------|
| 1 | Andre Drummond | DET | 2409 | 16.66 | 5.16 | 11.51 |
| 2 | DeAndre Jordan | LAC | 2570 | 15.60 | 4.17 | 11.43 |
| 3 | Hassan Whiteside | MIA | 2513 | 15.59 | 4.20 | 11.39 |
| 4 | Dwight Howard | ATL | 2199 | 15.39 | 4.85 | 10.54 |
| 5 | Rudy Gobert | UTA | 2744 | 13.58 | 4.12 | 9.46 |
| 6 | Jonas Valanciunas | TOR | 2066 | 13.23 | 3.94 | 9.29 |
| 7 | Nikola Vucevic | ORL | 2163 | 12.97 | 2.93 | 10.04 |
| 8 | Kevin Love | CLE | 1885 | 12.72 | 2.83 | 9.89 |
| 9 | Nikola Jokic | DEN | 2038 | 12.68 | 3.74 | 8.94 |
| 10 | Karl-Anthony Towns | MIN | 3030 | 11.96 | 3.52 | 8.45 |
| 11 | Marcin Gortat | WAS | 2555 | 11.96 | 3.35 | 8.61 |
| 12 | JaVale McGee | GSW | 739 | 11.89 | 4.87 | 7.01 |
| 13 | Anthony Davis | NOP | 2708 | 11.75 | 2.29 | 9.47 |
| 14 | DeMarcus Cousins | NOP | 2465 | 11.60 | 2.22 | 9.38 |
| 15 | Russell Westbrook | OKC | 2802 | 11.10 | 1.76 | 9.34 |
| 16 | Tristan Thompson | CLE | 2336 | 11.02 | 4.41 | 6.61 |
| 17 | Julius Randle | LAL | 2132 | 10.74 | 2.53 | 8.21 |
| 18 | Giannis Antetokounmpo | MIL | 2845 | 8.86 | 1.80 | 7.06 |
| 19 | Gorgui Dieng | MIN | 2653 | 8.78 | 2.55 | 6.23 |
| 20 | LeBron James | CLE | 2794 | 8.23 | 1.25 | 6.98 |

Rebounding Per 36 Min

Have the Warriors turned JaVale McGee into an elite big man?
To fully answer yes, we would need to see consistent starting minutes, something that most likely won’t happen with the current team structure. Did the Warriors solve their supposed rim-protection issues and in the process get an offensive juggernaut of a seven-footer on a minimum contract? Without a doubt. Stats adjusted for minutes played will always lead to hypothetical conclusions when extrapolated, and being surrounded by a team full of generational talent like the Warriors certainly makes everyone look better. What I can say is that Golden State has gotten above and beyond what it expected out of its veteran center, a player who nightly has a very tangible positive impact on a team surging toward its second title in three years. Through the first round of the playoffs, JaVale shows no sign of slowing down in his quest to add even more to what has been a career year.
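For readers who want to reproduce the tables above, every rate shown is the same simple normalization: scale a season total to a 36-minute game. A minimal Python sketch, using raw minutes and block totals taken from the blocks table (the helper name `per_36` is mine, not an official stat API):

```python
# Minimal sketch of the per-36 normalization behind the tables above.
# Raw minutes and block totals are copied from the blocks table.

def per_36(total, minutes):
    """Scale a season total to a per-36-minute rate."""
    return total / minutes * 36

players = [
    ("JaVale McGee", 739, 67),
    ("Kyle O'Quinn", 1229, 104),
    ("Rudy Gobert", 2744, 214),
]

for name, minutes, blocks in players:
    print(f"{name}: {per_36(blocks, minutes):.2f} blocks / 36 min")
# JaVale McGee: 3.26 blocks / 36 min
```

The same one-liner produces the +/-, second-chance, win-share, and rebounding rates; the small-sample caveat is exactly the one discussed above, since McGee's 739 minutes get scaled far more aggressively than a starter's 2500+.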
Underlying Structure Choices
By Cole Page
Published April 30, 2017
I want to take a second to talk about some of the choices I have made in regards to the structure of this site moving forward. These days there are endless paths to go down depending on what kind of site or blog you are trying to set up. Packages such as Pelican and Hyde make it extremely easy to write, publish, and push content in a simple, effective, and reproducible way, and platforms like Wordpress and GitHub Pages make publishing static content a breeze for beginners and experts alike.
Normally, since I plan to consistently produce material, I would go with one of these templated options, but the point of this site is much more than simply a blog. First, there are portions of this site that will not be static, which pretty much rules out any of the former options. Second, a large part of this site is about design, not just function, and I want complete control over every line of code and pixel, even if that makes each post's creation a little more tedious and drawn-out.
Mercurial Analytics: Iteration Zero

Hi there! This is the first article on Mercurial Analytics. It isn't data, analytics, or programming related. This is an introduction to what I do here and what the point of this website is, which although still quite vague and abstract, is starting to coalesce into something tangible.

By Cole Page
Published April 27, 2017
Hello. My name is Cole Page.