
Who is Jordan Bell?

A Statistical look at the former Oregon Fighting Duck and newest Golden State Warrior, Jordan Bell.

  • [Basketball]
    NBA Prediction Model:
    An Overhead Look at My Deep Learning Ensemble Methodology

    UNDER CONSTRUCTION

  • [deeplearning.ai]
    Week 3:
    Shallow Neural Networks

    UNDER CONSTRUCTION

  • [deeplearning.ai]
    Week 2:
    Neural Networks Basics

    UNDER CONSTRUCTION

  • [python/beautifulsoup4]
    Building an Application:
    Live NBA Scoreboard in Terminal

    UNDER CONSTRUCTION

  • [Basketball]
    NBA Team Rankings:
    First Quarter

    UNDER CONSTRUCTION

  • [Basketball]
    How Much Luck do the Irish Have Left?
    A Look Into the Celtics' Impressive Win Streak

    UNDER CONSTRUCTION

  • [deeplearning.ai]
    Week 1:
    Introduction to Deep Learning

    It has been a few months since I finished Andrew Ng's famous course on Machine Learning. It lived up to its hype and then some, so naturally when

  • [beautifulsoup4/postgres]
    Scraping NBA Line Movements with BeautifulSoup4

    Here I wanted to take a step backward and walk through how I created the database that was used in my last post, where I explored Plotly's Dash platform by creating a dashboard that displays real-time NBA line movements. The application runs constantly in the background on my Linux server using cascading crontabs, and was written completely in Python, using BeautifulSoup4 for the web scraping and Postgres queries to store the data.
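
    To give a flavor of the pipeline, here is a minimal sketch of what one scrape-and-store pass could look like; the URL, CSS selectors, and table schema are hypothetical stand-ins (it assumes the requests, beautifulsoup4, and psycopg2 packages), not the production code described in the post.

    import requests
    from bs4 import BeautifulSoup
    import psycopg2
    from datetime import datetime

    def scrape_and_store():
        # Fetch the odds page and parse it with BeautifulSoup4
        html = requests.get("https://example.com/nba-odds").text   # placeholder URL
        soup = BeautifulSoup(html, "html.parser")

        rows = []
        for game in soup.select("div.game"):                        # placeholder selector
            matchup = game.select_one(".teams").get_text(strip=True)
            spread = game.select_one(".spread").get_text(strip=True)
            total = game.select_one(".total").get_text(strip=True)
            rows.append((datetime.utcnow(), matchup, spread, total))

        # Store each snapshot in Postgres so line movements can be charted over time
        conn = psycopg2.connect(dbname="nba_lines", user="postgres")
        with conn, conn.cursor() as cur:
            cur.executemany(
                "INSERT INTO line_movements (scraped_at, matchup, spread, total) "
                "VALUES (%s, %s, %s, %s)",
                rows,
            )
        conn.close()

    if __name__ == "__main__":
        scrape_and_store()   # a crontab entry would call this on a schedule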

  • [Plotly/Dash]
    Dash Framework Exploration:
    Live NBA Line Movements

    Always in search of new data visualization methods and tools, I recently came across Dash, a Python framework built on top of Plotly.js, React, and Flask with the purpose of building analytical web applications completely in Python. Dash works as the frontend to our analytical Python backend. In order to get a feel for the platform I decided to build out a sample application that uses many of Dash's features.


    What is Dash?

    Dash is an open source Python library for creating reactive, web-based applications. Dash makes it extremely easy to build a GUI or live visualization around your data analysis code, all without stepping outside of the Python language. Dash apps are viewed in the web browser without the need for any user-built JavaScript or HTML, instead using a growing set of interactive web-based components to bind custom data analysis in Python to your Dash user interface.

    Dash applications are web servers running Flask and communicating JSON packets over HTTP requests. Dash’s frontend renders components using React.js, the Javascript user-interface library written and maintained by Facebook. As Plotly puts it: "Dash leverages the power of Flask and React, putting them to work for Python data scientists who may not be expert Web programmers."

    Dash components are Python classes that encode the properties and values of a specific React component and that serialize as JSON. While Dash ships with a large array of components that are easy to use out of the box, you are not limited to them alone. It is easy to port in your own React.js components for use in Dash applications, leaving the possibilities pretty endless.

    Charts are rendered with plotly.js (on top of D3.js and WebGL) sharing the same libraries and syntax, allowing developers to write Dash applications with the same functionality as any Plotly chart. Check out the Dash User Guide to learn more!

    Sample Application:

    Per usual, since I learn better by doing than simply reading, I decided to build out my own sample Dash application as a crash course into the framework. I tried to incorporate as many of the core Dash components as possible with a main focus on exploring the Live Updates module.

    Recently, I built a scraping/database system for the daily collection and storage of lines and odds movements for NBA games. (See article on Postgres and BeautifulSoup4 for more details) Since I have this database growing away in the background I figured why not use the data it's collecting as the food for this sample Dash Application!

    The goal of this application was to provide a Live Updating Dashboard that displays real-time betting line movements for NBA games. The initial goal was to simply pull live data from my database and display second by second updates for each team's spread and total line movements.
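
    Before getting into that, here is a minimal sketch of the live-update pattern the dashboard is built around; it is not the production code, and the query helper below is a hypothetical stand-in for the Postgres reads that feed the real app (it assumes the dash, dash_core_components, and dash_html_components packages of this era).

    import dash
    import dash_core_components as dcc
    import dash_html_components as html
    from dash.dependencies import Input, Output

    def fetch_line_movements():
        # Hypothetical stand-in for the Postgres query that returns timestamps and spreads
        return {"x": [1, 2, 3], "y": [4.5, 4.0, 3.5]}

    app = dash.Dash(__name__)
    app.layout = html.Div([
        html.H3("Live NBA Line Movements"),
        dcc.Graph(id="spread-graph"),
        dcc.Interval(id="refresh", interval=60 * 1000),   # re-fire the callback every 60 seconds
    ])

    @app.callback(Output("spread-graph", "figure"), [Input("refresh", "n_intervals")])
    def update_graph(n_intervals):
        # Each tick of the Interval re-queries the data and rebuilds the figure
        data = fetch_line_movements()
        return {
            "data": [{"x": data["x"], "y": data["y"], "type": "line", "name": "spread"}],
            "layout": {"title": "Spread Movement"},
        }

    if __name__ == "__main__":
        app.run_server(debug=True)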

    I'm going to write a bit more about the development process, issues, things to fix, additions, etc... but for now here is a link to the working application. At the moment it runs pretty slow: the database queries are pinging off my home server, none of it is optimized for real web traffic, and the hosting is a free Heroku deployment. So on your system it may take a long time to update components, or it may not load at all.

    DASHBOARD: https://nba-line-movements.herokuapp.com/

  • [Basketball]
    The Atrocious Defense of the Cleveland Cavaliers
    And How LeBron James Isn't Helping

    November 9th 2017


    It's getting late here on the east coast and I just finished watching the defending Eastern Conference Champions rack up their 7th loss of the season and 10th straight game giving up at least 110 points, ratcheting up their Defensive Efficiency to an absurdly porous 113.0, good for the worst in the NBA and on pace for one of the worst of all time. Of course we are only 1/7th into the season, and per usual the defensive sinkholes that are JR Smith and Kevin Love continue to show zero effort on the defensive end, but there is a bigger reason why the Cavs are such a joke on defense: LeBron James. Yes, that LeBron James, the best player in the NBA with the self-proclaimed ability to play every position on the floor at the highest level. Sure, this might still be true on the offensive side of the floor, but using your most recent "aggressive LeBron trail-block" video as a metric for his defensive prowess shouldn't be enough to fool anyone with half a brain and their eyes open.

    I'm starting to let my anti-LeBron feelings take over a little here, so I'll cool it for a second and go back to Cleveland's numerous other issues on the defensive end.

    So far this season Cleveland is giving up a league-worst 5 more points per 100 possessions (108 up to 113) compared to last season, not a good sign for a defense that was already dropping off in the second half of last year. Cleveland is currently 29th in Opponent Field Goal Percentage as well as 23rd in Blocks and 29th in Steals, showing a complete lack of individual effort across the board.

    The only caveat through these first 12 games is that I truly believe a majority of these defensive woes come from a lack of effort rather than a lack of ability, which makes turning them around a much easier task. If you've watched any of these early games, a recurring theme is the atrocious lack of energy at the start of games. The games they have won have usually involved LBJ taking over late and barely eking out a win, but Cleveland currently ranks dead last in both 1st half and 1st quarter points allowed, giving up an astounding 2 points more than the next worst team, the Lakers. I get it, you guys are really, really old and the energy just isn't there, but if you want any chance of getting back to the Finals this year, pop a few Red Bulls at shootaround and stop sleeping through the first half, or you are going to keep giving up 110+ points to the bottom of the East.

    Okay, back to LBJ! There are a lot of reasons why LeBron has seemingly dropped off so precipitously on the defensive end, and personally I think all of them come down to effort and a lack of killer mentality, the one thing that FOR ME will always keep him out of the GOAT conversation. Jordan, Kobe, Bird, Magic: they had it in droves, whether it was the 12th game of the season or Game 7 of the Finals. LeBron is the most athletically gifted, dominant player this league has probably ever seen, and yet he acts absurdly soft constantly. That's all there is to it.

    Opinions aside, let's look at a few statistics. With all his superhuman size and athleticism, LeBron has given up 181 FG inside the restricted area this year at a 66.5% clip, good for 2nd in the ENTIRE NBA, only 4 behind Thaddeus Young. One could argue that being 2nd in the league in minutes (4 behind DeMarcus Cousins) and an atrocious lack of help defense play a big factor, but when you just stand there, watch, and make this little effort down 6 with 6 to play as the first man back on the break, you don't get an excuse.

    Among the other telling statistics in the defensive liability that is early-season 2017 LeBron James is Opponent Points off Turnovers (effort, effort, effort!), where he ranks 429th out of 430 players, only 3 points behind defensive virtuoso James Harden. Add that to 414/430 for Opp 2nd Chance Points, 410/430 in Opp Fast Break Points, and 428/430 in Opp Points in the Paint, and you get a league bottom-5 worst Defensive Win Share of -0.2. Yes, that's NEGATIVE, as in "the King" personally detracts from the league's already-worst defense. Granted, his fellow Cavaliers JR Smith, Kevin Love, and Derrick Rose also round out that bottom 5 (along with Dallas' Wesley Matthews), but you expect the best player on the planet to at least not make things worse. I mean, no disrespect to Taurean Prince, but you're LeBron James, there are 30 seconds left in the quarter, you are losing to the Atlanta Hawks, and because of your "unreal basketball IQ" you know they are going for the 2-for-1, and you just do everything in your power to let them....

    I might be a hater, but I'm not a total asshole. The Cavs are only 12 games into a season where they brought in 10 new players, lost 8, and only kept 7, so a little adjustment time is warranted, but waiting on IT's return to fix this defense is not going to cut it. If things are going to change, and I truly think they will, LeBron needs to put forth some effort, plain and simple.

  • [CS231n]
    E1Q1: K-Nearest Neighbor Classifier


    This morning I want to test out my method for integrating IPython notebook work into blog posts while simultaneously brushing a little rust off my understanding of Convolutional Neural Nets. Let's do this by working through the first exercise in Stanford's CS231n: Convolutional Neural Networks for Visual Recognition class. Much of the material is familiar to me, but a lot of the application is new, as I have not worked extensively on applying ML to imaging problems, so it seems like a good way to get the gears turning!


    Q1: K-Nearest Neighbor Classifier:

    In this first part of assignment #1 we practice putting together a simple image classification pipeline, using a k-Nearest Neighbor classifier.

    In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression. In this assignment we utilize a very basic implementation to predict image labels in the CIFAR-10 Dataset.

    The kNN classifier consists of two stages:

    - During training, the classifier takes the training data and simply remembers it

    - During testing, kNN classifies every test image by comparing it to all training images and transferring the labels of the k most similar training examples

    - The value of k is cross-validated

    In the following we will implement these steps to better understand the basic Image Classification pipeline and cross-validation, and to gain proficiency in writing efficient, vectorized code.

    INITIAL SETUP

    Q1: k-Nearest Neighbor classifier

    In [1]:
    # Run some setup code for this notebook.
    
    import random
    import numpy as np
    from cs231n.data_utils import load_CIFAR10
    import matplotlib.pyplot as plt
    
    from __future__ import print_function
    
    # This is a bit of magic to make matplotlib figures appear inline in the notebook
    # rather than in a new window.
    %matplotlib inline
    plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
    plt.rcParams['image.interpolation'] = 'nearest'
    plt.rcParams['image.cmap'] = 'gray'
    
    # Some more magic so that the notebook will reload external python modules;
    # see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
    %load_ext autoreload
    %autoreload 2
    
    In [2]:
    # Load the raw CIFAR-10 data.
    cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
    X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
    
    # As a sanity check, we print out the size of the training and test data.
    print('Training data shape: ', X_train.shape)
    print('Training labels shape: ', y_train.shape)
    print('Test data shape: ', X_test.shape)
    print('Test labels shape: ', y_test.shape)
    
    Training data shape:  (50000, 32, 32, 3)
    Training labels shape:  (50000,)
    Test data shape:  (10000, 32, 32, 3)
    Test labels shape:  (10000,)
    
    In [3]:
    # Visualize some examples from the dataset.
    # We show a few examples of training images from each class.
    classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
    num_classes = len(classes)
    samples_per_class = 7
    for y, cls in enumerate(classes):
        idxs = np.flatnonzero(y_train == y)
        idxs = np.random.choice(idxs, samples_per_class, replace=False)
        for i, idx in enumerate(idxs):
            plt_idx = i * num_classes + y + 1
            plt.subplot(samples_per_class, num_classes, plt_idx)
            plt.imshow(X_train[idx].astype('uint8'))
            plt.axis('off')
            if i == 0:
                plt.title(cls)
    plt.show()
    
    In [4]:
    # Subsample the data for more efficient code execution in this exercise
    num_training = 5000
    mask = list(range(num_training))
    X_train = X_train[mask]
    y_train = y_train[mask]
    
    num_test = 500
    mask = list(range(num_test))
    X_test = X_test[mask]
    y_test = y_test[mask]
    
    In [5]:
    # Reshape the image data into rows
    X_train = np.reshape(X_train, (X_train.shape[0], -1))
    X_test = np.reshape(X_test, (X_test.shape[0], -1))
    print(X_train.shape, X_test.shape)
    
    (5000, 3072) (500, 3072)
    
    In [6]:
    from cs231n.classifiers import KNearestNeighbor
    
    # Create a kNN classifier instance. 
    # Remember that training a kNN classifier is a noop: 
    # the Classifier simply remembers the data and does no further processing 
    classifier = KNearestNeighbor()
    classifier.train(X_train, y_train)
    

    We would now like to classify the test data with the kNN classifier. Recall that we can break down this process into two steps:

    1. First we must compute the distances between all test examples and all train examples.

    2. Given these distances, for each test example we find the k nearest examples and have them vote for the label

    Lets begin with computing the distance matrix between all training and test examples. For example, if there are Ntr training examples and Nte test examples, this stage should result in a Nte x Ntr matrix where each element (i,j) is the distance between the i-th test and j-th train example.

    First, open `cs231n/classifiers/k_nearest_neighbor.py` and implement the function `compute_distances_two_loops` that uses a (very inefficient) double loop over all pairs of (test, train) examples and computes the distance matrix one element at a time.


    In [7]:
    # Open cs231n/classifiers/k_nearest_neighbor.py and implement
    # compute_distances_two_loops.
    
    # Test your implementation:
    dists = classifier.compute_distances_two_loops(X_test)
    print("Shape: " + str(dists.shape))
    
    Computing Distance:  0
    Computing Distance:  25
    Computing Distance:  50
    Computing Distance:  75
    Computing Distance:  100
    Computing Distance:  125
    Computing Distance:  150
    Computing Distance:  175
    Computing Distance:  200
    Computing Distance:  225
    Computing Distance:  250
    Computing Distance:  275
    Computing Distance:  300
    Computing Distance:  325
    Computing Distance:  350
    Computing Distance:  375
    Computing Distance:  400
    Computing Distance:  425
    Computing Distance:  450
    Computing Distance:  475
    Shape: (500, 5000)
    
    In [23]:
    # We can visualize the distance matrix: each row is a single test example and
    # its distances to training examples
    plt.imshow(dists, interpolation='none')
    plt.show()
    

    INLINE QUESTION #1:

    1. What in the data is the cause behind the distinctly bright rows?

    The rows represent each test image, and every column in that row is its distance to the 5000 training images. If the pixel for row i, column j is white then we see a high distance, or rather no similarity, between that test and training image. Thus a row (test image) that is distinctly bright means that the test image has very few similar images in the entire training set.

    2. What causes the columns?

    Similar to the explanation above, the white columns represent images in the training set that are similar to very few of the test images, as most of the 0-500 test images show large L2 distances (bright).


    In [42]:
    # Now implement the function predict_labels and run the code below:
    # We use k = 1 (which is Nearest Neighbor).
    y_test_pred = classifier.predict_labels(dists, k=1)
    
    # Compute and print the fraction of correctly predicted examples
    num_correct = np.sum(y_test_pred == y_test)
    accuracy = float(num_correct) / num_test
    print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
    
    Got 137 / 500 correct => accuracy: 0.274000
    
    In [109]:
    y_test_pred = classifier.predict_labels(dists, k=5)
    num_correct = np.sum(y_test_pred == y_test)
    accuracy = float(num_correct) / num_test
    print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
    
    Got 145 / 500 correct => accuracy: 0.290000
    
    In [46]:
    # Now lets speed up distance matrix computation by using partial vectorization
    # with one loop. Implement the function compute_distances_one_loop and run the
    # code below:
    dists_one = classifier.compute_distances_one_loop(X_test)
    
    Computing Distance:  0
    Computing Distance:  25
    Computing Distance:  50
    Computing Distance:  75
    Computing Distance:  100
    Computing Distance:  125
    Computing Distance:  150
    Computing Distance:  175
    Computing Distance:  200
    Computing Distance:  225
    Computing Distance:  250
    Computing Distance:  275
    Computing Distance:  300
    Computing Distance:  325
    Computing Distance:  350
    Computing Distance:  375
    Computing Distance:  400
    Computing Distance:  425
    Computing Distance:  450
    Computing Distance:  475
    
    In [47]:
    # To ensure that our vectorized implementation is correct, we make sure that it
    # agrees with the naive implementation. There are many ways to decide whether
    # two matrices are similar; one of the simplest is the Frobenius norm. In case
    # you haven't seen it before, the Frobenius norm of two matrices is the square
    # root of the squared sum of differences of all elements; in other words, reshape
    # the matrices into vectors and compute the Euclidean distance between them.
    difference = np.linalg.norm(dists - dists_one, ord='fro')
    print('Difference was: %f' % (difference, ))
    if difference < 0.001:
        print('Good! The distance matrices are the same')
    else:
        print('Uh-oh! The distance matrices are different')
    
    Difference was: 0.000000
    Good! The distance matrices are the same
    
    In [59]:
    # Now implement the fully vectorized version inside compute_distances_no_loops
    # and run the code
    dists_two = classifier.compute_distances_no_loops(X_test)
    
    # check that the distance matrix agrees with the one we computed before:
    difference = np.linalg.norm(dists - dists_two, ord='fro')
    print('Difference was: %f' % (difference, ))
    if difference < 0.001:
        print('Good! The distance matrices are the same')
    else:
        print('Uh-oh! The distance matrices are different')
    
    Difference was: 0.000000
    Good! The distance matrices are the same
    
    In [53]:
    # Let's compare how fast the implementations are
    def time_function(f, *args):
        """
        Call a function f with args and return the time (in seconds) that it took to execute.
        """
        import time
        tic = time.time()
        f(*args)
        toc = time.time()
        return toc - tic
    
    two_loop_time = time_function(classifier.compute_distances_two_loops, X_test)
    print('Two loop version took %f seconds' % two_loop_time)
    
    one_loop_time = time_function(classifier.compute_distances_one_loop, X_test)
    print('One loop version took %f seconds' % one_loop_time)
    
    no_loop_time = time_function(classifier.compute_distances_no_loops, X_test)
    print('No loop version took %f seconds' % no_loop_time)
    
    # you should see significantly faster performance with the fully vectorized implementation
    
    FINISHED WITH TEST IMAGE:  0
    FINISHED WITH TEST IMAGE:  50
    FINISHED WITH TEST IMAGE:  100
    FINISHED WITH TEST IMAGE:  150
    FINISHED WITH TEST IMAGE:  200
    FINISHED WITH TEST IMAGE:  250
    FINISHED WITH TEST IMAGE:  300
    FINISHED WITH TEST IMAGE:  350
    FINISHED WITH TEST IMAGE:  400
    FINISHED WITH TEST IMAGE:  450
    Two loop version took 32.644345 seconds
    FINISHED WITH TEST IMAGE:  0
    FINISHED WITH TEST IMAGE:  50
    FINISHED WITH TEST IMAGE:  100
    FINISHED WITH TEST IMAGE:  150
    FINISHED WITH TEST IMAGE:  200
    FINISHED WITH TEST IMAGE:  250
    FINISHED WITH TEST IMAGE:  300
    FINISHED WITH TEST IMAGE:  350
    FINISHED WITH TEST IMAGE:  400
    FINISHED WITH TEST IMAGE:  450
    One loop version took 75.630863 seconds
    No loop version took 0.413558 seconds
    

    CROSS VALIDATION

    We have implemented the k-Nearest Neighbor classifier but we set the value k = 5 arbitrarily. We will now determine the best value of this hyperparameter with cross-validation.

    In [142]:
    num_folds = 5
    k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]
    
    X_train_folds = []
    y_train_folds = []
    
    #use numpy array_split function to split the training data into num_folds folds
    X_train_folds = np.array_split(X_train, num_folds)
    y_train_folds = np.array_split(y_train, num_folds)
    
    # A dictionary holding the accuracies for different values of k that we find
    # when running cross-validation. After running cross-validation,
    # k_to_accuracies[k] should be a list of length num_folds giving the different
    # accuracy values that we found when using that value of k.
    k_to_accuracies = {}
    
    for k in k_choices:
        k_to_accuracies[k] = []
        # Run kNN algorithm num_folds times
        for i in range(num_folds):
            X_train = []
            y_train = []
            for j in range(num_folds):
                if i != j:
                    X_train.extend(X_train_folds[j])
                    y_train.extend(y_train_folds[j])
    
            X_train = np.array(X_train)
            y_train = np.array(y_train)
            classifier = KNearestNeighbor()
            classifier.train(X_train, y_train)
            dists = classifier.compute_distances_no_loops(X_test)
            y_test_pred = classifier.predict_labels(dists, k=k)
    
            num_correct = np.sum(y_test_pred == y_test)
            accuracy = float(num_correct) / num_test
    
            k_to_accuracies[k].append(accuracy)
    
    # Print out the computed accuracies
    for k in sorted(k_to_accuracies):
        for accuracy in k_to_accuracies[k]:
            print('k = %d, accuracy = %f' % (k, accuracy))
    
    k = 1, accuracy = 0.258000
    k = 1, accuracy = 0.276000
    k = 1, accuracy = 0.260000
    k = 1, accuracy = 0.250000
    k = 1, accuracy = 0.254000
    k = 3, accuracy = 0.276000
    k = 3, accuracy = 0.280000
    k = 3, accuracy = 0.262000
    k = 3, accuracy = 0.272000
    k = 3, accuracy = 0.252000
    k = 5, accuracy = 0.284000
    k = 5, accuracy = 0.294000
    k = 5, accuracy = 0.272000
    k = 5, accuracy = 0.268000
    k = 5, accuracy = 0.280000
    k = 8, accuracy = 0.280000
    k = 8, accuracy = 0.282000
    k = 8, accuracy = 0.282000
    k = 8, accuracy = 0.250000
    k = 8, accuracy = 0.290000
    k = 10, accuracy = 0.274000
    k = 10, accuracy = 0.286000
    k = 10, accuracy = 0.278000
    k = 10, accuracy = 0.260000
    k = 10, accuracy = 0.270000
    k = 12, accuracy = 0.282000
    k = 12, accuracy = 0.266000
    k = 12, accuracy = 0.272000
    k = 12, accuracy = 0.276000
    k = 12, accuracy = 0.280000
    k = 15, accuracy = 0.278000
    k = 15, accuracy = 0.270000
    k = 15, accuracy = 0.250000
    k = 15, accuracy = 0.262000
    k = 15, accuracy = 0.270000
    k = 20, accuracy = 0.274000
    k = 20, accuracy = 0.254000
    k = 20, accuracy = 0.242000
    k = 20, accuracy = 0.258000
    k = 20, accuracy = 0.274000
    k = 50, accuracy = 0.240000
    k = 50, accuracy = 0.234000
    k = 50, accuracy = 0.234000
    k = 50, accuracy = 0.246000
    k = 50, accuracy = 0.234000
    k = 100, accuracy = 0.230000
    k = 100, accuracy = 0.218000
    k = 100, accuracy = 0.224000
    k = 100, accuracy = 0.224000
    k = 100, accuracy = 0.224000
    
    In [143]:
    # plot the raw observations
    for k in k_choices:
        accuracies = k_to_accuracies[k]
        plt.scatter([k] * len(accuracies), accuracies)
    
    # plot the trend line with error bars that correspond to standard deviation
    accuracies_mean = np.array([np.mean(v) for k,v in sorted(k_to_accuracies.items())])
    accuracies_std = np.array([np.std(v) for k,v in sorted(k_to_accuracies.items())])
    plt.errorbar(k_choices, accuracies_mean, yerr=accuracies_std)
    plt.title('Cross-validation on k')
    plt.xlabel('k')
    plt.ylabel('Cross-validation accuracy')
    
    plt.show()
    
    In [152]:
    # Based on the cross-validation results above, choose the best value for k,   
    # retrain the classifier using all the training data, and test it on the test
    # data. You should be able to get above 28% accuracy on the test data.
    best_k = 8
    
    classifier = KNearestNeighbor()
    classifier.train(X_train, y_train)
    y_test_pred = classifier.predict(X_test, k=best_k)
    
    # Compute and display the accuracy
    num_correct = np.sum(y_test_pred == y_test)
    accuracy = float(num_correct) / num_test
    print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))
    
    Got 145 / 500 correct => accuracy: 0.290000
    

    K-NEAREST NEIGHBOR CLASS

    The following code contains the kNN class and its methods that I use in the above analysis.

    In [ ]:
    import numpy as np
    from past.builtins import xrange
    from collections import Counter
    
    class KNearestNeighbor(object):
        """ a kNN classifier with L2 distance """
    
        def __init__(self):
            pass
    
        def train(self, X, y):
            """
            Train the classifier. For k-nearest neighbors this is just 
            memorizing the training data.
    
            Inputs:
            - X: A numpy array of shape (num_train, D) containing the training data
              consisting of num_train samples each of dimension D.
            - y: A numpy array of shape (N,) containing the training labels, where
                 y[i] is the label for X[i].
            """
            self.X_train = X
            self.y_train = y
    
        def predict(self, X, k=1, num_loops=0):
            """
            Predict labels for test data using this classifier.
    
            Inputs:
            - X: A numpy array of shape (num_test, D) containing test data consisting
                 of num_test samples each of dimension D.
            - k: The number of nearest neighbors that vote for the predicted labels.
            - num_loops: Determines which implementation to use to compute distances
              between training points and testing points.
    
            Returns:
            - y: A numpy array of shape (num_test,) containing predicted labels for the
              test data, where y[i] is the predicted label for the test point X[i].  
            """
            if num_loops == 0:
                dists = self.compute_distances_no_loops(X)
            elif num_loops == 1:
                dists = self.compute_distances_one_loop(X)
            elif num_loops == 2:
                dists = self.compute_distances_two_loops(X)
            else:
                raise ValueError('Invalid value %d for num_loops' % num_loops)
    
            return self.predict_labels(dists, k=k)
    
        def compute_distances_two_loops(self, X):
            """
            Compute the distance between each test point in X and each training point
            in self.X_train using a nested loop over both the training data and the 
            test data.
    
            Inputs:
            - X: A numpy array of shape (num_test, D) containing test data.
    
            Returns:
            - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
              is the Euclidean distance between the ith test point and the jth training
              point.
            """
    
            num_test = X.shape[0]
            num_train = self.X_train.shape[0]
            dists = np.zeros((num_test, num_train))
            for i in xrange(num_test):
                if i%50 == 0:
                    print("FINISHED WITH TEST IMAGE: ",i)
                for j in xrange(num_train):
                    dists[i, j] = np.sqrt(np.sum((X[i, :] - self.X_train[j, :]) ** 2))
    
            return dists
    
        def compute_distances_one_loop(self, X):
            """
            Compute the distance between each test point in X and each training point
            in self.X_train using a single loop over the test data.
    
            Input / Output: Same as compute_distances_two_loops
            """
            num_test = X.shape[0]
            num_train = self.X_train.shape[0]
            dists = np.zeros((num_test, num_train))
    
            for i in xrange(num_test):
    
                if i%50 == 0:
                    print("FINISHED WITH TEST IMAGE: ",i)
                dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))
    
            return dists
    
        def compute_distances_no_loops(self, X):
            """
            Compute the distance between each test point in X and each training point
            in self.X_train using no explicit loops.
    
            Input / Output: Same as compute_distances_two_loops
            """
            num_test = X.shape[0]
            num_train = self.X_train.shape[0]
            dists = np.zeros((num_test, num_train))
    
    
            # (x-y)^2 = x^2 + y^2 - 2xy --> test_sum + train_sum - 2*inner_product
            test_sum = np.sum(np.square(X), axis=1) # shape -> (500,)
            train_sum = np.sum(np.square(self.X_train), axis=1) # shape -> (5000,)
            inner_product = np.dot(X, self.X_train.T) # shape -> (500,5000)
    
            # reshape test_sum from (500,) to (500,1) while keeping same data
            # the -1 infers same shape as before (500)
            dists = np.sqrt(test_sum.reshape(-1, 1) + train_sum - 2*inner_product)
    
            return dists
    
        def predict_labels(self, dists, k=1):
            """
            Given a matrix of distances between test points and training points,
            predict a label for each test point.
    
            Inputs:
            - dists: A numpy array of shape (num_test, num_train) where dists[i, j]
              gives the distance betwen the ith test point and the jth training point.
    
            Returns:
            - y: A numpy array of shape (num_test,) containing predicted labels for the
              test data, where y[i] is the predicted label for the test point X[i].  
            """
            num_test = dists.shape[0]
            y_pred = np.zeros(num_test)
            for i in xrange(num_test):
                # A list of length k storing the labels of the k nearest neighbors to
                # the ith test point.
                closest_y = []
    
                #while looping through i's get the distances of each images (i) with every
                #training image and save that as a new numpy array "dists_i"
                dists_i = dists[i]
    
                #dists_i.argsort() gives the indices of sorted distances, low to high.
                #dists_i.argsort()[:k], gives k lowest distance indices (k Nearest Neighbors)
                #y_train["this lowest distance indice"] gives the labels of that training img
                #this array closest_y is of len=k
                closest_y = self.y_train[dists_i.argsort()[:k]]
    
                #choose the most common label in closest_y
                #(ties broken with lowest label...Counter does this....)
                y_pred[i] = Counter(closest_y).most_common(1)[0][0]
    
    
            return y_pred
    

  • [Notes]
    Embedding Jupyter Notebooks


    As discussed in my post on the underlying site structure, I have opted to build out every piece of this site from scratch instead of working with static-page generators such as Pelican or Hyde. I discuss the logic behind this choice there, so I won't reiterate, but because of it I need to set up processes for displaying work moving forward. Jupyter Notebooks are one of the more constant tools I use in my workflow, so I want to quickly go through my process of converting them to embedded HTML that is usable in these posts.

    For this I have decided to use nbconvert, a Jupyter Notebook conversion tool, built on top of Jinja that allows researchers/scientists to quickly and easily deliver results across a number of static mediums such as PDF, LaTeX, and HTML. In this site I will be utilizing the HTML output format, which allows you to tweak the templating and customize the HTML static rendering.

    Nbconvert offers two primary methods of conversion: first, a command line script that takes .ipynb files and outputs the desired static format, and second, a Python library that can be used programmatically. The latter is fantastic for use with static site generators such as Pelican, since it works in memory to dynamically convert notebooks inside a publishing pipeline without ever having to read or write from disk. That being said, since I have opted for a non-static blog where I force myself to code every piece from scratch, I will be using the command line method to generate static HTML which I then manually fold into each article's PHP file. While this might not be very practical in a production environment, it works great for the purposes of this platform.
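
    As a rough sketch of that programmatic route (using nbconvert's HTMLExporter; exact template option names vary a bit between nbconvert versions):

    # Programmatic conversion sketch: notebook file in, static HTML string out.
    import nbformat
    from nbconvert import HTMLExporter

    nb = nbformat.read("notebook.ipynb", as_version=4)    # load the .ipynb as a notebook node

    exporter = HTMLExporter()                             # defaults to the full HTML template
    body, resources = exporter.from_notebook_node(nb)     # body is the rendered HTML string

    with open("content.html", "w") as f:
        f.write(body)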

    Installation is as simple as a conda or pip package install, and for basic ipynb -> html conversion no other packages are required. For other format conversion nbconvert uses Pandoc or your OS's TeX distribution.

    Command line usage is very simple and the basic format is as follows:

    $ jupyter nbconvert --to FORMAT notebook.ipynb

    The default output format is HTML, for which the --to argument may be omitted, so implementation gets even easier:

    $ jupyter nbconvert notebook.ipynb

    As I stated earlier, one of the great aspects of nbconvert is the ability to use templates by adding the --template argument to the above command. Jupyter provides a few templates: --template full, which produces a full static HTML render that looks very similar to the interactive view, and --template basic, which produces a simplified version for basic webpage/blog embedding.
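
    For example, rendering a notebook with the simplified template looks something like:

    $ jupyter nbconvert --to html --template basic notebook.ipynb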

    For now I will be using --template basic for my posts, with a few simple CSS changes to hide or show functionality. The two main additions are hiding the display of the input prompts and some added margin on the rendered boxes, simple stuff. The basic template is simple, clear, and displays everything well enough for the time being. Down the road I plan on creating a custom template, and when I do I will document my thought process. In the meantime, it's easy to simply add CSS properties to any HTML rendered from the basic template to make small changes as you go.

    The next and last step is to include the newly created static HTML, and it is very simple. If you wanted, you could just open the created HTML file and copy/paste the code into the given article's PHP file, but this is sloppy. For larger notebooks the rendered static code can be hundreds of lines of not particularly aesthetic markup, and just dropping it in makes editing and making changes to each article a mess. Instead you can just use a w3-include-html attribute and a tiny bit of JavaScript.

    <html>
    <script src="https://www.w3schools.com/lib/w3.js"></script>
    <body>

    <div w3-include-html="content.html"></div>

    <script>
    w3.includeHTML();
    </script>

    </body>
    </html>

    As the sample above shows, this is a very quick couple of lines of code, and it keeps the messy nbconvert output hidden away in its own space. This also allows multiple includes for long articles to be seamlessly added in and around the article's HTML and PHP. Make sure you have the w3.js script included, then throw in the w3 attribute div wherever you want the notebook placed, and make sure to include the w3.includeHTML() call. That's all there is to it!






    For more information on nbconvert check out their docs: https://nbconvert.readthedocs.io/en/latest/#

  • [Notes]
    Moving Apartments!

    Next week I'm moving into a new apartment so I will probably be a little bit dormant here for the next month or two while I get settled. See ya in a bit!


    Don't get used to posts like this. I truly abhor blogging in general and am going to keep this site to analytical, academic, and sports musings, not self-obsessed posts about the meaningless (to you) things I did over the preceding days. I am not nearly talented enough a writer to make narcissistic millennial ramblings appealing to anyone who doesn't list a social media platform as their primary occupation. Anyway....

    Big week next week over here in Philadelphia! Leaving the comfy confines of my little studio apartment and moving in with an SO for the first time! Been practicing putting the seat down for weeks now, so don't worry, I'm pretty sure I'm completely ready for this change. That's all there is to it, right? We found an awesome two-bedroom up in Fishtown, which is a rapidly growing area of northern Philly. It's still a few years away from fully blowing up, but it kind of feels like Philly's much smaller Brooklyn circa 2003 (I think?? I truly have no clue, I was 13 and in California in 2003, but I've heard people say this).

    The space is amazing, and two bedrooms means I get to have my first real office that's not within a foot of my bed, woohoo! We have about 20 times more space than furniture and belongings, so I have a feeling it's going to take a lot of Craigslist diving and Ikea trips to feel even remotely moved in. We also have an awesome little real-grass dog park on the roof of the building, so fingers crossed it's finally almost time for another member of the family!

    See ya in a few....

  • [Basketball]
    JaVale By the Numbers

    May 3rd 2017


    Just prior to the start of the 2016-2017 NBA season, when the big-man-depleted Golden State Warriors signed JaVale McGee to a 1-year minimum contract, the critics quickly surfaced. Articles bashing JaVale were numerous, and praise nonexistent. An article on Complex ranked JaVale as the 12th worst player in the NBA and wrote “The Golden State Warriors need help defending the paint so they’d figure, ‘Hey, JaVale is a giant human with a heartbeat, let’s sign him!’ Problem is he really isn’t that good. Maybe the organization of the future will somehow turn JaVale into an elite big man, or maybe they’re just desperate for some rim protection.”

    Well, seven months later, four wins into the 2017 playoffs, it looks like the Golden State organization may have done exactly that. Not only did JaVale produce important big-man minutes off the bench all season, he started ten games in place of the injured Zaza Pachulia and put together one of the most impressive per-minute statistical seasons for a center in recent memory. JaVale only averaged 9.6 minutes a game and much of his statistical dominance may be attributed to the strength of the league-leading team surrounding him, but anyone who actually watched this season got a chance to see a whole new JaVale: a powerfully athletic, rim destroying, ball swatting monster having a career year.

    Looking at his raw averages of 6.1 points, 3.2 rebounds, and 0.9 blocks in only 9.6 minutes per game won't blow you away, but his per-36-minute stats are absurd. In this article we take a look at just how a starting, full-minutes JaVale would stack up against the rest of the league. I'm not suggesting that increasing his playing time would affect his stats linearly like this, and with Golden State's depth and small-ball play we most likely aren't going to find out any time soon, but the Warriors are without a doubt playing some of their best basketball when he's on the floor, and so far JaVale has silenced most if not all of his critics; looking at you, Shaq.
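
    For reference, the per-36 adjustment used throughout this post is just a raw season total scaled by minutes played; a quick sketch in Python using JaVale's dunk numbers from the table below:

    # Per-36-minute adjustment: scale a season total by 36 / minutes played.
    def per_36(stat_total, minutes_played):
        return 36.0 * stat_total / minutes_played

    print(round(per_36(121, 739), 2))   # 121 dunks in 739 minutes -> 5.89 dunks per 36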

    Shooting

    Dunks

    Of all the stats I am going to talk about today, this one should be the least surprising and is without a doubt the most dominant. Through all his blunders and mistakes over the years, JaVale has never failed to throw down with authority. At 7' tall and sporting an 8' wingspan, there aren't many people on the planet who have that kind of range around the rim. Hometown bias and NBA sponsorship (KIA) aside, JaVale probably should have won the 2011 Slam Dunk Contest, and he annually puts up highlight-reel dunks. With a fraction of the minutes played, JaVale still ranked 12th in total dunks this season. Adjusted, JaVale is averaging 5.89 dunks/36min, a whopping 2.35 more than this season's dunk leader DeAndre Jordan. The Clippers might still hold the title of Lob City, but nobody has dominated the alley-oop this season like Draymond, Andre, and Steph chucking up bombs to JaVale McGee.

    RANK PLAYER TEAM MINUTES DUNKS Dunks / 36min RANK PLAYER TEAM MINUTES DUNKS Dunks / 36min
    1 JaVale McGee GSW 739 121 5.89 11 Marquese Chriss PHX 1743 103 2.13
    2 Clint Capela HOU 1551 163 3.78 12 Jabari Parker MIL 1728 92 1.92
    3 DeAndre Jordan LAC 2570 253 3.54 13 Tristan Thompson CLE 2336 122 1.88
    4 Montrezl Harrell HOU 1064 98 3.32 14 LeBron James CLE 2794 145 1.87
    5 Dwight Howard ATL 2199 199 3.26 15 Kevin Durant GSW 2070 107 1.86
    6 Rudy Gobert UTA 2744 235 3.08 16 Anthony Davis NOP 2708 135 1.79
    7 Richaun Holmes PHI 1193 92 2.78 17 Andre Drummond DET 2409 118 1.76
    8 Giannis Antetokounmpo MIL 2845 194 2.45 18 Steven Adams OKC 2389 112 1.69
    9 Hassan Whiteside MIA 2513 163 2.34 19 Aaron Gordon ORL 2298 99 1.55
    10 Mason Plumlee DEN 2147 132 2.21 20 Karl-Anthony Towns MIN 3030 130 1.54
    Dunks Per 36 Min

    Source: NBA.com/stats

    Still unconvinced? Check out all 121 of JaVale's Dunks below.

    Points in the Paint

    When your per-36 adjusted dunks are this far ahead of the competition, it’s not too much of a surprise that his adjusted points in the paint would reflect the same trend. Playing on a team that dished it at historic numbers this year and constantly looks for the 3rd pass doesn’t hurt either. Add that to the absurd gravitational pull of Golden State’s guards on the 3 point line and you have yourself a big-man PITP feeding frenzy on a nightly basis, something JaVale has happily taken advantage of this season. JaVale’s 20.00 PITP per-36 is the highest since Shaquille O’Neal’s 20.00 in 2001-2002. In fact, in the past 20 years no other player even eclipsed 15.5 PITP per-36, Shaq pulling off the feat for a dominant 11 straight years between 1996-2007.

    RANK PLAYER TEAM PITP MIN PITP / 36MIN RANK PLAYER TEAM PITP MIN PITP / 36MIN
    1 JaVale McGee GSW 410 739 20.00 11 DeMarcus Cousins NOP 848 2465 12.38
    2 Clint Capela HOU 720 1551 16.71 12 Dwight Howard ATL 748 2199 12.25
    3 Enes Kanter OKC 704 1533 16.53 13 Brook Lopez BKN 732 2222 11.86
    4 Hassan Whiteside MIA 970 2513 13.90 14 DeAndre Jordan LAC 824 2570 11.54
    4 Karl-Anthony Towns MIN 1154 3030 13.71 15 Steven Adams OKC 742 2389 11.18
    6 LeBron James CLE 1032 2794 13.30 16 Rudy Gobert UTA 822 2744 10.78
    7 Giannis Antetokounmpo MIL 1044 2845 13.21 17 Russell Westbrook OKC 816 2802 10.48
    8 Andre Drummond DET 882 2409 13.18 18 DeMar DeRozan TOR 760 2620 10.44
    9 Anthony Davis NOP 970 2708 12.90 19 Isaiah Thomas BOS 726 2569 10.17
    10 Nikola Jokic DEN 730 2038 12.89 20 John Wall WAS 774 2835 9.83
    Points in the Paint Per 36 Min

    Source: NBA.com/stats

    Offensive Rating

    Individual offensive rating is a tough stat to separate from the play of the team as a whole, but ranking #1 even when the rest of your team is in the top 10 still tells an impressive story. Not only does JaVale fit into the offensive juggernaut that is the Golden State Warriors, he makes them the most potent version of themselves when he’s on the floor.

    5-Man Lineup         OFFRTG  DEFRTG  NETRTG
    GSW Big 4 + JaVale   124.4   92.2    32.1

    The Warriors lineup with McGee/Curry/Thompson/Durant/Green has the highest net rating of any five-man combo in the NBA this season with a minimum of 100 min played.

    RANK PLAYER TEAM OFFRTG RANK PLAYER TEAM OFFRTG
    1 JaVale McGee GSW 121.4 11 Nikola Jokic DEN 114.9
    2 Stephen Curry GSW 118.1 12 LeBron James CLE 114.9
    3 Pierre Jackson DAL 117.9 13 JJ Redick LAC 114.6
    4 Kevin Durant GSW 117.2 14 Jordan Farmar SAC 114.5
    5 Chris Paul LAC 116.2 15 Andre Iguodala GSW 114.3
    6 Zaza Pachulia GSW 115.8 16 Kyrie Irving CLE 114.2
    7 Klay Thompson GSW 115.6 17 Ryan Anderson HOU 113.8
    8 Draymond Green GSW 115.2 18 Clint Capela HOU 113.7
    9 Blake Griffin LAC 115.2 19 Isaiah Thomas BOS 113.6
    10 Gary Harris DEN 115.0 20 James Harden HOU 113.6
    Offensive Rating

    Source: NBA.com/stats

    PLUS/MINUS

    Plus/Minus further illustrates how the already outstanding Warriors are even better when JaVale is on the floor. It’s not surprising that the team with the NBA’s best record, #1 Offensive Efficiency and #2 Defensive Efficiency would dominate the plus/minus category.

    PLAYER TEAM +/- MIN PLUSMINUS / 36min RANK PLAYER TEAM +/- MIN PLUSMINUS / 36min
    1 JaVale McGee GSW 312 739 15.20 11 Blake Griffin LAC 440 2076 7.63
    2 Stephen Curry GSW 1015 2638 13.85 11 Ryan Anderson HOU 407 2116 6.92
    3 Kevin Durant GSW 711 2070 12.37 13 DeAndre Jordan LAC 459 2570 6.43
    4 Draymond Green GSW 820 2471 11.95 14 Kawhi Leonard SAS 436 2474 6.34
    5 Zaza Pachulia GSW 418 1268 11.87 15 LeBron James CLE 483 2794 6.22
    6 Klay Thompson GSW 801 2649 10.89 16 Patrick Beverley HOU 353 2058 6.17
    7 Chris Paul LAC 577 1921 10.81 17 Kyle Lowry TOR 358 2244 5.74
    8 Andre Iguodala GSW 527 1998 9.50 18 Rudy Gobert UTA 436 2744 5.72
    9 Patty Mills SAS 410 1754 8.42 19 Jae Crowder BOS 349 2335 5.38
    10 JJ Redick LAC 470 2198 7.70 20 James Harden HOU 425 2947 5.19
    Plus/Minus Per 36 Min

    Source: NBA.com/stats

    SECOND CHANCE POINTS

    RANK PLAYER TEAM MIN 2ND PTS 2ND PTS / 36MIN RANK PLAYER TEAM MIN 2ND PTS 2ND PTS / 36MIN
    1 Enes Kanter OKC 1533 260 6.11 11 LaMarcus Aldridge SAS 2335 247 3.81
    2 Hassan Whiteside MIA 2513 375 5.37 12 Russell Westbrook OKC 2802 292 3.75
    3 Andre Drummond DET 2409 355 5.31 13 Anthony Davis NOP 2708 282 3.75
    4 Dwight Howard ATL 2199 315 5.16 14 DeAndre Jordan LAC 2570 266 3.73
    5 Zach Randolph MEM 1786 255 5.14 14 Kevin Love CLE 1885 195 3.72
    6 JaVale McGee GSW 739 96 4.68 16 Robin Lopez CHI 2271 225 3.57
    7 Karl-Anthony Towns MIN 3030 386 4.59 16 DeMarcus Cousins NOP 2465 225 3.29
    8 Nikola Jokic DEN 2038 246 4.35 18 Steven Adams OKC 2389 210 3.17
    9 Jonas Valanciunas TOR 2066 240 4.18 19 Carmelo Anthony NYK 2538 210 2.98
    10 Rudy Gobert UTA 2744 301 3.95 20 Jimmy Butler CHI 2809 198 2.54
    2nd Chance Points Per 36 Min

    Source: NBA.com/stats

    Defense & Shot Blocking

    Blocks

    JaVale's class-A airspace around the hoop doesn't just exist on the offensive end. With premier ball stoppers like Draymond Green, Klay Thompson, and Andre Iguodala hounding the opposing offense, JaVale consistently waits a step away to thunderously deny all shot attempts. With a 2017 block highlight reel almost as long as his dunk tape, JaVale has been nothing but impregnable around the rim, a defensive factor Golden State thought it would be sorely lacking with the departures of Andrew Bogut and Festus Ezeli. In fact, pre-season this was the chink in the armor that many thought could fell the reigning Western Conference champions. JaVale has done more than his part to change that tune.

    RANK PLAYER TEAM MINUTES BLOCKS BLOCKS / 36MIN RANK PLAYER TEAM MINUTES BLOCKS BLOCKS / 36MIN
    1 JaVale McGee GSW 739 67 3.26 11 DeAndre Jordan LAC 2570 134 1.88
    2 Kyle O'Quinn NYK 1229 104 3.05 12 Robin Lopez CHI 2271 117 1.85
    3 Rudy Gobert UTA 2744 214 2.81 13 Serge Ibaka TOR 2422 124 1.84
    4 Myles Turner IND 2541 172 2.44 14 Kevin Durant GSW 2070 99 1.72
    5 Hassan Whiteside MIA 2513 161 2.31 15 Draymond Green GSW 2471 106 1.54
    6 Alex Len PHX 1560 98 2.26 16 Mason Plumlee DEN 2147 92 1.54
    7 Anthony Davis NOP 2708 167 2.22 17 Dwight Howard ATL 2199 92 1.51
    8 Kristaps Porzingis NYK 2164 129 2.15 18 Marc Gasol MEM 2531 99 1.41
    9 Brook Lopez BKN 2222 124 2.01 19 DeMarcus Cousins NOP 2465 93 1.36
    10 Giannis Antetokounmpo MIL 2845 151 1.91 20 Gorgui Dieng MIN 2653 95 1.29
    Blocks Per 36 Min

    Source: NBA.com/stats

    Watching segments like this one, it's not hard to see how JaVale's impact on both sides of the ball coupled with Golden State's long-ball wizardry leads to league leading +/- statistics.

    Defensive Win Shares

    Win Shares is a player statistic that attempts to divide credit for a team's success among the individuals on that team.

    RANK PLAYER TEAM MIN DEF WS DEF WS / 36MIN
    1 Draymond Green GSW 2471 4.7 0.0685
    2 Patty Mills SAS 1754 3.3 0.0677
    3 Stephen Curry GSW 2638 4.6 0.0628
    4 Rudy Gobert UTA 2744 4.7 0.0617
    5 Andre Iguodala GSW 1998 3.4 0.0613
    6 Klay Thompson GSW 2649 4.5 0.0612
    7 Kevin Durant GSW 2070 3.5 0.0609
    8 James Johnson MIA 2085 3.2 0.0553
    9 Victor Oladipo OKC 2222 3.4 0.0551
    10 Gordon Hayward UTA 2516 3.8 0.0544
    11 JaVale McGee GSW 739 1.1 0.0536
    12 Anthony Davis NOP 2708 4.0 0.0532
    13 Jrue Holiday NOP 2190 3.2 0.0526
    14 Paul Millsap ATL 2343 3.4 0.0522
    15 LaMarcus Aldridge SAS 2335 3.3 0.0509
    16 Solomon Hill NOP 2374 3.3 0.0500
    17 Andre Roberson OKC 2376 3.3 0.0500
    18 Jimmy Butler CHI 2809 3.9 0.0500
    19 DeAndre Jordan LAC 2570 3.5 0.0490
    20 Kawhi Leonard SAS 2474 3.2 0.0466
    Defensive Win Shares Per 36 Min

    Source: NBA.com/stats

    Rebounding

    JaVale’s defensive rebounding statistics are impressive for a team that defaults its rebounds to pace-pushing forwards like Green, Iguodala, and Durant (much the same reason we don’t see Houston or Oklahoma City bigs on this list). What really stands out is his adjusted OREB per 36, which would sit at #2 in the league at 4.87 if he played starter’s minutes. Many of the statistics discussed above are driven by offensive rebounding. JaVale doesn’t come into the game expecting to be integral to the Golden State offense aside from lobs at the rim; instead he takes advantage of the top perimeter-shooting offense in the league and the way it pulls opposing defenders out to the 3pt line. The Warriors hit quite a lot of their long balls, but when they don’t, JaVale has been there to pull down boards and put them back in with authority. This consistent combination of dominant physical athleticism and mental awareness is a JaVale the league hasn’t seen before.

    RANK PLAYER TEAM MIN REB / 36MIN OREB / 36MIN DREB / 36MIN
    1 Andre Drummond DET 2409 16.66 5.16 11.51
    2 DeAndre Jordan LAC 2570 15.60 4.17 11.43
    3 Hassan Whiteside MIA 2513 15.59 4.20 11.39
    4 Dwight Howard ATL 2199 15.39 4.85 10.54
    5 Rudy Gobert UTA 2744 13.58 4.12 9.46
    6 Jonas Valanciunas TOR 2066 13.23 3.94 9.29
    7 Nikola Vucevic ORL 2163 12.97 2.93 10.04
    8 Kevin Love CLE 1885 12.72 2.83 9.89
    9 Nikola Jokic DEN 2038 12.68 3.74 8.94
    10 Karl-Anthony Towns MIN 3030 11.96 3.52 8.45
    11 Marcin Gortat WAS 2555 11.96 3.35 8.61
    12 JaVale McGee GSW 739 11.89 4.87 7.01
    13 Anthony Davis NOP 2708 11.75 2.29 9.47
    14 DeMarcus Cousins NOP 2465 11.60 2.22 9.38
    15 Russell Westbrook OKC 2802 11.10 1.76 9.34
    16 Tristan Thompson CLE 2336 11.02 4.41 6.61
    17 Julius Randle LAL 2132 10.74 2.53 8.21
    18 Giannis Antetokounmpo MIL 2845 8.86 1.80 7.06
    19 Gorgui Dieng MIN 2653 8.78 2.55 6.23
    20 LeBron James CLE 2794 8.23 1.25 6.98
    Rebounding Per 36 Min

    Source: NBA.com/stats
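
    The claim above that his OREB rate "would sit at #2 in the league" if he played starter minutes is exactly the kind of extrapolation the closing paragraph cautions about: the per-36 rate itself never changes, you are just assuming it holds over a much bigger workload. A quick sketch of what that projection looks like; the 2,400-minute starter workload is purely my own illustrative assumption:

        OREB_PER_36 = 4.87   # JaVale's offensive rebound rate from the table above
        ACTUAL_MIN = 739     # minutes he actually played
        STARTER_MIN = 2400   # assumed starter workload, illustrative only

        actual_oreb = OREB_PER_36 * ACTUAL_MIN / 36      # ~100 offensive boards
        projected_oreb = OREB_PER_36 * STARTER_MIN / 36  # ~325 if the rate held
        print(round(actual_oreb), round(projected_oreb))  # 100 325

    The projection is only as trustworthy as the assumption that the rate would survive roughly tripling his minutes, which is the caveat the conclusion below makes explicit.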

    Have the Warriors turned JaVale McGee into an elite big man?
    To fully answer yes we would need to see consistent starting minutes, something that most likely won’t happen with the current team structure. Did the Warriors solve their supposed rim-protection issues and, in the process, get an offensive juggernaut of a seven-footer on a minimum contract? Without a doubt. Stats adjusted for minutes played will always lead to hypothetical conclusions when extrapolated, and being surrounded by a team full of generational talent like the Warriors certainly makes everyone look better. What I can say is that Golden State has gotten above and beyond what it expected out of its veteran center, a player who nightly has a very tangible positive impact on a team surging toward its second title in three years. Through the first round of the playoffs JaVale shows no sign of slowing down in his quest to add even more to what has been a career year.

  • [Notes]
    Underlying Structure Choices


    I want to take a second to talk about some of the choices I have made regarding the structure of this site moving forward. These days there are endless paths to go down depending on what kind of site or blog you are trying to set up. Packages such as Pelican and Hyde make it extremely easy to write, publish, and push content in a simple, effective, and reproducible way, and platforms like WordPress and GitHub Pages make publishing static content a breeze for beginners and experts alike.

    Normally, since I plan to consistently produce material, I would go with one of these templated options, but the point of this site is much more than simply a blog. First, there are portions of this site that will not be static, which pretty much rules out any of the former options. Second, a large part of this site is about design, not just function, and I want complete control over every line of code and pixel, even if that makes each post's creation a little more tedious and drawn out.
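
    To make "not static" concrete: a generator like Pelican bakes every page to HTML once, while the pieces I care about need to be rendered per request. Here is a minimal sketch of that kind of route, assuming a Flask-style backend and a hypothetical get_latest_articles() helper; neither is necessarily what this site actually runs on:

        from flask import Flask, jsonify

        app = Flask(__name__)

        def get_latest_articles():
            """Hypothetical helper standing in for a real database query."""
            return [{"title": "Mercurial Analytics: Iteration Zero", "topic": "The Beginning"}]

        @app.route("/articles")
        def articles():
            # Built fresh on every request, so the content can change without a
            # site rebuild, which is the piece a static generator doesn't give you.
            return jsonify(articles=get_latest_articles())

        if __name__ == "__main__":
            app.run(debug=True)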

    ........

    Eventually color by article topics?

  • [The Beginning]
    Mercurial Analytics:
    Iteration Zero

    This is the first iteration of Mercurial Analytics, a sort-of public sandbox and playground of ideas, work, and musings belonging to Cole Page. Right now a fairly abstract, curvy idea, it may eventually coalesce into something stable and with purpose, or it may not...

    Mercurial describes a person, or in this case the work produced by a person (me), that is subject to sudden or unpredictable changes of mood or mind. Sometimes that's a positive, sometimes not so much; for better or worse, this is how my mind works. The work here will be anything but linear, and 100% dictated by my fairly capricious daily interests.

    This first iteration of Mercurial Analytics will be an online platform for me to test and play with various skills, languages, and tools as I learn them. It will be a blog of sorts, publishing data-driven articles on an array of topics, but mostly the plan is for it to be an expansive digital whiteboard that never gets erased and houses all manner of thoughts, from complete and well thought out to utterly incoherent rambling.

    The primary responsibility of data scientists, programmers, statisticians, and academics is to learn. Tools, languages, methods, and skills are constantly changing and evolving, and each iteration and advancement adds to and alters how you approach problem solving.

    Throughout this iteration no work will truly be complete. Many pieces may seem broken, articles unfinished, or functionality pointless as new skills are learned, added, and incorporated into the greater amalgam. I am (at least presently) a fairly atrocious writer, so don't expect any mind-blowing prose. I would love to get better and eventually I hope to, but that is not the primary, secondary, or even tertiary purpose of this place.

    Don’t expect anything to work how you think it should; it won’t. But the data being used, as well as the accompanying insights and opinions, will be real and at least semi thought out. My opinions will always be right, and yet quite often very wrong, and that's okay because this is my whiteboard and I hold all the markers. There are plenty of spaces for collaborative, inclusive discussion out there; this is not one of them. This is not a public park, this is my backyard, so hang out, observe, and enjoy, or go ahead and get off my lawn!

    The next iteration may be something wholly different. I have some decently vague, possibly interesting ideas for iteration two's direction, but for now:

    This is simply a digital notebook, publication, canvas, and autodidact playground.

    Enjoy. Or ya know, don't.... totally up to you



    P.S. If for some insane reason you want to pay me to do anything, give me a shout with your ill-advised proposition: coleatmercurialanalyticsdotcom
"TO CONDENSE FACT FROM THE VAPOR OF NUANCE"