A planet of blogs from our members...

Tim Hopper: Sampling from a Hierarchical Dirichlet Process

This may be more readable on NBViewer.

In [137]:
%matplotlib inline

As we saw earlier, the Dirichlet process describes the distribution of a random probability distribution. The Dirichlet process takes two parameters: a base distribution $H_0$ and a dispersion parameter $\alpha$. A sample from the Dirichlet process is itself a probability distribution that looks like $H_0$. On average, the larger $\alpha$ is, the closer a sample from $\text{DP}(\alpha H_0)$ will be to $H_0$.

Suppose we're feeling masochistic and want to input a distribution sampled from a Dirichlet process as base distribution to a new Dirichlet process. (It will turn out that there are good reasons for this!) Conceptually this makes sense. But can we construct such a thing in practice? Said another way, can we build a sampler that will draw samples from a probability distribution drawn from these nested Dirichlet processes? We might initially try to construct a sample (a probability distribution) from the first Dirichlet process before feeding it into the second.

But recall that fully constructing a sample (a probability distribution!) from a Dirichlet process would require drawing a countably infinite number of samples from $H_0$ and from the beta distribution to generate the weights. This would take forever, even with Hadoop!

Dan Roy, et al. helpfully described a technique of using stochastic memoization to construct a distribution sampled from a Dirichlet process in a just-in-time manner. This process provides us with the equivalent of the Scipy rvs method for the sampled distribution. Stochastic memoization is equivalent to the Chinese restaurant process: sometimes you get seated at an occupied table (i.e. sometimes you're given a sample you've seen before) and sometimes you're put at a new table (given a unique sample).
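The seating rule can be sketched in a few lines. This is a minimal illustration of the Chinese restaurant process on its own, assuming the usual rule that the customer arriving after $n$ others opens a new table with probability $\alpha/(\alpha+n)$ and otherwise joins an occupied table with probability proportional to its size. The function name and the seed argument are made up for this example.

```python
import random

def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Return the list of table sizes after seating n_customers (illustrative sketch)."""
    rng = random.Random(seed)
    table_sizes = []
    for n in range(n_customers):
        # With n customers already seated, open a new table
        # with probability alpha / (alpha + n).
        if rng.random() < alpha / (alpha + n):
            table_sizes.append(1)
        else:
            # Otherwise join an occupied table, chosen with probability
            # proportional to the number of customers already sitting there.
            k = rng.choices(range(len(table_sizes)), weights=table_sizes)[0]
            table_sizes[k] += 1
    return table_sizes

few_tables = chinese_restaurant_process(2000, alpha=1)
many_tables = chinese_restaurant_process(2000, alpha=100)
print("tables with alpha=1: %d" % len(few_tables))
print("tables with alpha=100: %d" % len(many_tables))
```

A table here plays the role of a cached value in the memoizer: joining an occupied table means reusing an old sample, and opening a new table means drawing a fresh one from the base measure.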

Here is our memoization class again:

In [162]:
from numpy.random import choice 
from scipy.stats import beta

class DirichletProcessSample():
    def __init__(self, base_measure, alpha):
        self.base_measure = base_measure
        self.alpha = alpha
        
        self.cache = []
        self.weights = []
        self.total_stick_used = 0.

    def __call__(self):
        remaining = 1.0 - self.total_stick_used
        i = DirichletProcessSample.roll_die(self.weights + [remaining])
        if i is not None and i < len(self.weights) :
            return self.cache[i]
        else:
            stick_piece = beta(1, self.alpha).rvs() * remaining
            self.total_stick_used += stick_piece
            self.weights.append(stick_piece)
            new_value = self.base_measure()
            self.cache.append(new_value)
            return new_value
        
    @staticmethod 
    def roll_die(weights):
        if weights:
            return choice(range(len(weights)), p=weights)
        else:
            return None

Let's illustrate again with a standard normal base measure. We can construct a function base_measure that generates samples from it.

In [95]:
from scipy.stats import norm

base_measure = lambda: norm().rvs() 

Because the normal distribution has continuous support, we can generate samples from it forever and we will never see the same sample twice (in theory). We can illustrate this by drawing from the distribution ten thousand times and seeing that we get ten thousand unique values.

In [163]:
from pandas import Series

ndraws = 10000
print "Number of unique samples after {} draws:".format(ndraws), 
draws = Series([base_measure() for _ in range(ndraws)])
print draws.unique().size
Number of unique samples after 10000 draws: 10000

However, when we feed the base measure through the stochastic memoization procedure and then sample, we get many duplicate samples. The number of unique samples goes down as $\alpha$ decreases.

In [164]:
norm_dp = DirichletProcessSample(base_measure, alpha=100)

print "Number of unique samples after {} draws:".format(ndraws), 
dp_draws = Series([norm_dp() for _ in range(ndraws)])
print dp_draws.unique().size
Number of unique samples after 10000 draws: 446
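As a rough sanity check on that count, here is a sketch of the expected number of unique values. It assumes a continuous base measure (so fresh draws never collide) and uses the standard Chinese restaurant process fact that the draw made after $i$ earlier draws is a new value with probability $\alpha/(\alpha+i)$. The function name is made up for this example.

```python
def expected_unique(alpha, n_draws):
    """Expected number of distinct values among n_draws samples from H ~ DP(alpha * H0)."""
    # The draw made after i earlier draws is brand new with probability alpha / (alpha + i);
    # summing those probabilities gives the expected count of distinct values.
    return sum(alpha / (alpha + i) for i in range(n_draws))

for a in [1, 10, 100]:
    print("alpha = %s: about %.0f unique values expected" % (a, expected_unique(a, 10000)))
```

For $\alpha=100$ and 10,000 draws the sum comes to roughly 460, which is in line with the 446 observed in this particular run.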

At this point, we have a function norm_dp that returns samples from a probability distribution (specifically, a probability distribution sampled from $\text{DP}(\alpha H_0)$). We can use norm_dp as a base distribution for another Dirichlet process!

In [155]:
norm_hdp = DirichletProcessSample(norm_dp, alpha=10)

How do we interpret this? norm_dp is a sampler from a probability distribution that looks like the standard normal distribution. norm_hdp is a sampler from a probability distribution that "looks like" the distribution norm_dp samples from.

Here is a histogram of samples drawn from norm_dp, our first sampler.

In [152]:
import matplotlib.pyplot as plt
import pandas as pd  # the cells below refer to pandas as pd

pd.Series(norm_dp() for _ in range(10000)).hist()
_=plt.title("Histogram of Samples from norm_dp")

And here is a histogram for samples drawn from norm_hdp, our second sampler.

In [154]:
pd.Series(norm_hdp() for _ in range(10000)).hist()
_=plt.title("Histogram of Samples from norm_hdp")

The second plot doesn't look very much like the first! The level to which a sample from a Dirichlet process approximates the base distribution is a function of the dispersion parameter $\alpha$. Because I set $\alpha=10$ (which is relatively small), the approximation is fairly coarse. In terms of memoization, a small $\alpha$ value means the stochastic memoizer will more frequently reuse values already seen instead of drawing new ones.

This nesting procedure, where a sample from one Dirichlet process is fed into another Dirichlet process as a base distribution, is more than just a curiosity. It is known as a Hierarchical Dirichlet Process, and it plays an important role in the study of Bayesian Nonparametrics (more on this in a future post).

Without the stochastic memoization framework, constructing a sampler for a hierarchical Dirichlet process is a daunting task. We want to be able to draw samples from a distribution drawn from the second-level Dirichlet process. However, to do that, we need to be able to draw samples from the base distribution of the second-level Dirichlet process, and this base distribution is itself a distribution drawn from the first-level Dirichlet process.

It might appear that we would need to fully construct the first-level sample (by drawing a countably infinite number of samples from the first-level base distribution). However, stochastic memoization allows us to construct the first distribution just-in-time, only as it is needed at the second level.
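To make the just-in-time point concrete, here is a small self-contained sketch. LazyDP is a compact stand-in written for this example (it is not the DirichletProcessSample class itself), with a counter added so we can see how many times each level actually had to consult its base measure.

```python
import numpy as np

rng = np.random.default_rng(0)

class LazyDP:
    """Compact, hypothetical stand-in for a lazily constructed DP sample."""
    def __init__(self, base_measure, alpha):
        self.base_measure = base_measure
        self.alpha = alpha
        self.atoms = []        # values pulled from the base measure so far
        self.weights = []      # stick pieces assigned to those values
        self.n_base_calls = 0  # how often the base measure was consulted

    def __call__(self):
        remaining = max(1.0 - sum(self.weights), 0.0)
        probs = np.array(self.weights + [remaining])
        i = rng.choice(len(probs), p=probs / probs.sum())
        if i < len(self.atoms):
            return self.atoms[i]  # reuse a value we have already seen
        # Otherwise break a new piece off the unused stick and draw a fresh atom.
        self.weights.append(rng.beta(1, self.alpha) * remaining)
        self.n_base_calls += 1
        self.atoms.append(self.base_measure())
        return self.atoms[-1]

first_level = LazyDP(lambda: rng.normal(), alpha=10)
second_level = LazyDP(first_level, alpha=20)
draws = [second_level() for _ in range(1000)]

print("calls into the first level: %d" % second_level.n_base_calls)
print("calls into the normal base measure: %d" % first_level.n_base_calls)
```

Only finitely many atoms are ever materialized at either level, and every value returned by the second level is one of the atoms cached by the first.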

We can define a Python class to encapsulate the Hierarchical Dirichlet Process as a subclass of the Dirichlet process class.

In [165]:
class HierarchicalDirichletProcessSample(DirichletProcessSample):
    def __init__(self, base_measure, alpha1, alpha2):
        first_level_dp = DirichletProcessSample(base_measure, alpha1)
        self.second_level_dp = DirichletProcessSample(first_level_dp, alpha2)

    def __call__(self):
        return self.second_level_dp()

Since the Hierarchical DP is a Dirichlet process inside of a Dirichlet process, we must provide it with both a first- and a second-level $\alpha$ value.

In [167]:
norm_hdp = HierarchicalDirichletProcessSample(base_measure, alpha1=10, alpha2=20)

We can sample directly from the probability distribution drawn from the Hierarchical Dirichlet Process.

In [170]:
pd.Series(norm_hdp() for _ in range(10000)).hist()
_=plt.title("Histogram of samples from distribution drawn from Hierarchical DP")

norm_hdp is not equivalent to the Hierarchical Dirichlet Process; it samples from a single distribution sampled from this HDP. Each time we instantiate the norm_hdp variable, we are getting a sampler for a unique distribution. Below we sample five times and get five different distributions.

In [180]:
for i in range(5):
    norm_hdp = HierarchicalDirichletProcessSample(base_measure, alpha1=10, alpha2=10)
    _=pd.Series(norm_hdp() for _ in range(100)).hist()
    _=plt.title("Histogram of samples from distribution drawn from Hierarchical DP")
    _=plt.figure()

In a later post, I will discuss how these tools are applied in the realm of Bayesian nonparametrics.

Tim Hopper: High Quality Code at Quora

I love this new post on Quora's engineering blog. The post states "high code quality is the long-term boost to development speed" and goes on to explain how they go about accomplishing this.

I've inherited large code bases at each of my jobs out of grad school, and I've spent a lot of time thinking about this question. At least on the surface, I love the solutions Quora has in place for ensuring quality code: thoughtful code review, careful testing, style guidelines, static checking, and intentional code cleanup.

Og Maciel: Books - July 2015


This July 2015 I travelled to the Red Hat office in Brno, Czech Republic to spend some time with my teammates there, and I managed to get a lot of reading done between long plane rides and being jet lagged for many nights :) So I finally managed to finish up some of the books that had been lingering on my ToDo list and even managed to finally read a few of the books that together make up the Chronicles of Narnia, since I had never read them as a kid.

Read

Out of all the books I read this month, I feel that All Quiet on the Western Front and The October Country were the ones I enjoyed reading the most, closely followed by Cryptonomicon, which took me a while to get through. The other books, with the exception of The Memoirs of Sherlock Holmes, helped me pass the time when I only wanted to be entertained.

All Quiet on the Western Front takes the prize for being one of the best books I have ever read! I felt that the way WWI was presented through the eyes of the main character was a great way to represent all the pain, angst and suffering that all sides of the conflict went through, without catering to any particular side or having an agenda. Erich Maria Remarque's style had me sometimes breathless, sometimes with a knot in the pit of my stomach as I 'endured' the many life-changing events that took place in the book. Is this an action-packed book about WWI? Will it read like a thriller? In my opinion, even though there are many chapters with gory details about killings and battles, the answer is a very bland 'maybe'. I think that the real 'star' of this book is its philosophical view of the war and how the main characters, all around 19-20 years of age, learn to deal with its lifelong effects.

Now, I have been a huge fan of Ray Bradbury for a while, and when I got The October Country for my birthday last month, I just knew that it would be time well spent reading it. For those of you who are more acquainted with his science fiction works, this book will surprise you as it shows you a bit of his 'darker' side. All of the short stories included in this collection deal with death, mysterious apparitions, inexplicable endings and are sure to spook you a little bit.

Cryptonomicon was at times slow, at other times funny and, especially toward the end, a very entertaining book. Weighing in at a hefty 1000 pages (depending on the edition you have, plus/minus 50 odd pages), this book covers two different periods in the lives of a number of different characters, past (around WWII) and present, with all the different threads eventually leading to a great finale. Alternating between past and present, the story takes us to the early days of how cryptology was 'officially invented' and used during the war, and how many of the events that took place back then were affecting the lives of some of the direct descendants of the main characters in our present day. As you go through the back and forth you start to gather bits and pieces of information that eventually connect all the dots of an interesting puzzle. It definitely requires a long-term commitment to go through it, but it was enjoyable and, as I mentioned before, it made me laugh in many places.

Caktus Group: Using Unsaved Related Models for Sample Data in Django 1.8

Note: In between the time I originally wrote this post and it getting published, a ticket and pull request were opened in Django to remove allow_unsaved_instance_assignment and move validation to the model save() method, which makes much more sense anyways. It's likely this will even be backported to Django 1.8.4. So, if you're using a version of Django that doesn't require this, hopefully you'll never stumble across this post in the first place! If this is still an issue for you, here's the original post:

In versions of Django prior to 1.8, it was easy to construct "sample" model data by putting together a collection of related model objects, even if none of those objects was saved to the database. Django 1.8 - 1.8.3 adds a restriction that prevents this behavior. Errors such as this are generally a sign that you're encountering this issue:

ValueError: Cannot assign "...": "MyRelatedModel" instance isn't saved in the database.

The justification for this is that, previously, unsaved foreign keys were silently lost if they were not saved to the database. Django 1.8 does provide a backwards compatibility flag to allow working around the issue. The workaround, per the Django documentation, is to create a new ForeignKey field that removes this restriction, like so:

from django.db import models

class UnsavedForeignKey(models.ForeignKey):
    # A ForeignKey which can point to an unsaved object
    allow_unsaved_instance_assignment = True

class Book(models.Model):
    author = UnsavedForeignKey(Author)

This may be undesirable, however, because this approach means you lose all protection for all uses of this foreign key, even if you want Django to ensure foreign key values have been saved before being assigned in some cases.

There is a middle ground, not immediately obvious, that involves changing this attribute temporarily during the assignment of an unsaved value and then immediately changing it back. This can be accomplished by writing a context manager to change the attribute, for example:

import contextlib

@contextlib.contextmanager
def allow_unsaved(model, field):
    model_field = model._meta.get_field(field)
    saved = model_field.allow_unsaved_instance_assignment
    model_field.allow_unsaved_instance_assignment = True
    try:
        yield
    finally:
        # restore the original setting even if the assignment raises
        model_field.allow_unsaved_instance_assignment = saved

To use this context manager, wrap any assignment of an unsaved foreign key value in a with block as follows:

with allow_unsaved(MyModel, 'my_fk_field'):
    my_obj.my_fk_field = unsaved_instance
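One detail worth checking with a pattern like this is that the flag is put back even when the code inside the with block fails. The sketch below shows this in plain Python, assuming the yield is wrapped in try/finally. FakeField and temporarily_allow are hypothetical stand-ins invented for the example, so no Django setup is needed to run it.

```python
import contextlib

class FakeField:
    """Hypothetical stand-in for a Django ForeignKey field object."""
    allow_unsaved_instance_assignment = False

@contextlib.contextmanager
def temporarily_allow(field):
    saved = field.allow_unsaved_instance_assignment
    field.allow_unsaved_instance_assignment = True
    try:
        yield field
    finally:
        # Runs on normal exit and on exceptions alike.
        field.allow_unsaved_instance_assignment = saved

field = FakeField()
try:
    with temporarily_allow(field):
        inside = field.allow_unsaved_instance_assignment
        raise ValueError("simulated failure during assignment")
except ValueError:
    pass
after = field.allow_unsaved_instance_assignment

print("flag inside the block: %s" % inside)
print("flag after the block: %s" % after)
```

Without the try/finally, an exception raised inside the block would leave the flag switched on for every later use of the field.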

The specifics of how you access the field to pass into the context manager are important; any other way will likely generate the following error:

RelatedObjectDoesNotExist: MyModel has no instance.

While strictly speaking this approach is not thread safe, it should work for any process-based worker model (such as the default "sync" worker in Gunicorn).

This took a few iterations to figure out, so hopefully it will (still) prove useful to someone else!

Tim Hopper: 10x Engineering

Tim Hopper: Notes on the Dirichlet Distribution and Dirichlet Process

In [3]:
%matplotlib inline

Note: I wrote this post in an IPython notebook. It might be rendered better on NBViewer.

Dirichlet Distribution

The Dirichlet distribution (DD) can be considered a distribution of distributions. Each sample from the DD is a categorical distribution over $K$ categories. It is parameterized by $G_0$, a distribution over $K$ categories, and $\alpha$, a scale factor.

The expected value of the DD is $G_0$. The variance of the DD is a function of the scale factor. When $\alpha$ is large, samples from $DD(\alpha\cdot G_0)$ will be very close to $G_0$. When $\alpha$ is small, samples will vary more widely.

We demonstrate below by setting $G_0=[.2, .2, .6]$ and varying $\alpha$ from 0.1 to 1000. In each case, the mean of the samples is roughly $G_0$, but the standard deviation decreases as $\alpha$ increases.

In [10]:
import numpy as np
from scipy.stats import dirichlet
np.set_printoptions(precision=2)

def stats(scale_factor, G0=[.2, .2, .6], N=10000):
    samples = dirichlet(alpha = scale_factor * np.array(G0)).rvs(N)
    print "                          alpha:", scale_factor
    print "              element-wise mean:", samples.mean(axis=0)
    print "element-wise standard deviation:", samples.std(axis=0)
    print
    
for scale in [0.1, 1, 10, 100, 1000]:
    stats(scale)
                          alpha: 0.1
              element-wise mean: [ 0.2  0.2  0.6]
element-wise standard deviation: [ 0.38  0.38  0.47]

                          alpha: 1
              element-wise mean: [ 0.2  0.2  0.6]
element-wise standard deviation: [ 0.28  0.28  0.35]

                          alpha: 10
              element-wise mean: [ 0.2  0.2  0.6]
element-wise standard deviation: [ 0.12  0.12  0.15]

                          alpha: 100
              element-wise mean: [ 0.2  0.2  0.6]
element-wise standard deviation: [ 0.04  0.04  0.05]

                          alpha: 1000
              element-wise mean: [ 0.2  0.2  0.6]
element-wise standard deviation: [ 0.01  0.01  0.02]
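These empirical values can be compared against the closed form. A minimal sketch, assuming $G_0$ sums to 1 so that the concentration vector $\alpha\cdot G_0$ has total mass $\alpha$: the variance of component $i$ is then $g_i(1-g_i)/(\alpha+1)$. The function name dirichlet_std is made up for this example.

```python
import numpy as np

def dirichlet_std(scale_factor, G0=(.2, .2, .6)):
    """Theoretical element-wise standard deviation of Dirichlet(scale_factor * G0)."""
    g = np.asarray(G0, dtype=float)
    # Var[X_i] = g_i * (1 - g_i) / (alpha + 1) when the concentration is alpha * G0
    # and G0 sums to 1.
    return np.sqrt(g * (1 - g) / (scale_factor + 1))

for scale in [0.1, 1, 10, 100, 1000]:
    print("alpha = %s: %s" % (scale, np.round(dirichlet_std(scale), 2)))
```

The $1/(\alpha+1)$ factor is what makes samples concentrate around $G_0$ as the scale factor grows.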

Dirichlet Process

The Dirichlet Process can be considered a way to generalize the Dirichlet distribution. While the Dirichlet distribution is parameterized by a discrete distribution $G_0$ and generates samples that are similar discrete distributions, the Dirichlet process is parameterized by a generic distribution $H_0$ and generates samples which are distributions similar to $H_0$. The Dirichlet process also has a parameter $\alpha$ that determines how widely samples will vary from $H_0$.

We can construct a sample $H$ (recall that $H$ is a probability distribution) from a Dirichlet process $\text{DP}(\alpha H_0)$ by drawing a countably infinite number of samples $\theta_k$ from $H_0$ and setting:

$$H=\sum_{k=1}^\infty \pi_k \cdot\delta(x-\theta_k)$$

where the $\pi_k$ are carefully chosen weights (more later) that sum to 1. ($\delta$ is the Dirac delta function.)

$H$, a sample from $\text{DP}(\alpha H_0)$, is a probability distribution that looks similar to $H_0$ (also a distribution). In particular, $H$ is a discrete distribution that takes the value $\theta_k$ with probability $\pi_k$. This sampled distribution $H$ is a discrete distribution even if $H_0$ has continuous support; the support of $H$ is a countably infinite subset of the support of $H_0$.

The weights ($\pi_k$ values) of a Dirichlet process sample relate the Dirichlet process back to the Dirichlet distribution.

Gregor Heinrich writes:

The defining property of the DP is that its samples have weights $\pi_k$ and locations $\theta_k$ distributed in such a way that when partitioning $S(H)$ into finitely many arbitrary disjoint subsets $S_1, \ldots, S_J$, $J<\infty$, the sums of the weights $\pi_k$ in each of these $J$ subsets are distributed according to a Dirichlet distribution that is parameterized by $\alpha$ and a discrete base distribution (like $G_0$) whose weights are equal to the integrals of the base distribution $H_0$ over the subsets $S_n$.

As an example, Heinrich imagines a DP with a standard normal base measure $H_0=\mathcal{N}(0,1)$. Let $H$ be a sample from $\text{DP}(\alpha H_0)$ and partition the real line (the support of a normal distribution) as $S_1=(-\infty, -1]$, $S_2=(-1, 1]$, and $S_3=(1, \infty)$. Then

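$$H(S_1),H(S_2), H(S_3) \sim \text{Dir}\left(\alpha\,\Phi(-1), \alpha\,(\Phi(1) - \Phi(-1)), \alpha\,(1-\Phi(1))\right)$$

where $\Phi$ is the standard normal CDF (so each parameter is $\alpha$ times the probability that $H_0$ assigns to the corresponding subset) and $H(S_n)$ is the sum of the $\pi_k$ values whose $\theta_k$ lie in $S_n$.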

These $S_n$ subsets are chosen for convenience; however, similar results would hold for any choice of $S_n$. For any sample from a Dirichlet process, we can construct a sample from a Dirichlet distribution by partitioning the support of the sample into a finite number of bins.
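To put numbers on the partition example, here is a minimal sketch that computes the Dirichlet parameters for a standard normal base measure: each one is $\alpha$ times the probability the base measure assigns to a bin. The normal CDF is written in terms of the error function as $\frac{1}{2}\left(1+\text{erf}(x/\sqrt{2})\right)$. The function names and the choice $\alpha=10$ are assumptions made for this illustration.

```python
import math

def normal_cdf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def partition_parameters(alpha, cut_points):
    """Dirichlet parameters for the bins of the real line defined by cut_points."""
    edges = [-math.inf] + list(cut_points) + [math.inf]
    cdf = [normal_cdf(e) for e in edges]
    # Each parameter is alpha times the base-measure probability of one bin.
    return [alpha * (hi - lo) for lo, hi in zip(cdf[:-1], cdf[1:])]

params = partition_parameters(alpha=10, cut_points=[-1, 1])
print("Dirichlet parameters: %s" % [round(p, 3) for p in params])
```

The parameters are all positive and add up to $\alpha$, since the bins cover the whole real line.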

There are several equivalent ways to choose the $\pi_k$ so that this property is satisfied: the Chinese restaurant process, the stick-breaking process, and the Pólya urn scheme.

To generate $\left\{\pi_k\right\}$ according to a stick-breaking process we define $\beta_k$ to be a sample from $\text{Beta}(1,\alpha)$. $\pi_1$ is equal to $\beta_1$. Successive values are defined recursively as

$$\pi_k=\beta_k \prod_{j=1}^{k-1}(1-\beta_j).$$

Thus, if we want to draw a sample from a Dirichlet process, we could, in theory, sample an infinite number of $\theta_k$ values from the base distribution $H_0$ and an infinite number of $\beta_k$ values from the Beta distribution. Of course, sampling an infinite number of values is easier in theory than in practice.

However, because the $\pi_k$ values are positive and sum to 1, they must, in expectation, get increasingly small as $k\rightarrow\infty$. Thus, we can reasonably approximate a sample $H\sim \text{DP}(\alpha H_0)$ by drawing enough samples such that $\sum_{k=1}^K \pi_k\approx 1$.
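A vectorized sketch of the truncated stick-breaking construction may help here. It also includes a back-of-the-envelope estimate of how many breaks are needed: since $E[1-\beta_k]=\alpha/(1+\alpha)$ and the $\beta_k$ are independent, the expected leftover stick after $K$ breaks is $(\alpha/(1+\alpha))^K$. Both function names are made up for this example, and the estimate only holds in expectation; a particular run may need more or fewer breaks.

```python
import math
import numpy as np

def stick_breaking_weights(alpha, n_sticks, rng):
    """First n_sticks weights pi_k of a stick-breaking process."""
    betas = rng.beta(1, alpha, size=n_sticks)
    # Stick left over before each break: 1, (1-b1), (1-b1)(1-b2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1 - betas)[:-1]))
    return betas * remaining

def breaks_for_expected_leftover(alpha, tol=0.01):
    """Smallest K with (alpha / (1 + alpha)) ** K <= tol."""
    return math.ceil(math.log(tol) / math.log(alpha / (1.0 + alpha)))

rng = np.random.default_rng(0)
n_sticks = breaks_for_expected_leftover(alpha=10, tol=0.01)
pis = stick_breaking_weights(alpha=10, n_sticks=n_sticks, rng=rng)
print("breaks used: %d" % n_sticks)
print("stick covered: %.3f" % pis.sum())
```

Larger $\alpha$ values make each break smaller on average, so many more breaks are needed before the weights cover most of the stick.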

We use this method below to draw approximate samples from several Dirichlet processes with a standard normal ($\mathcal{N}(0,1)$) base distribution but varying $\alpha$ values.

Recall that a single sample from a Dirichlet process is a probability distribution over a countably infinite subset of the support of the base measure.

The blue line is the PDF for a standard normal. The black lines represent the $\theta_k$ and $\pi_k$ values; $\theta_k$ is indicated by the position of the black line on the $x$-axis; $\pi_k$ is proportional to the height of each line.

We generate enough $\pi_k$ values that their sum is greater than 0.99. When $\alpha$ is small, very few $\theta_k$'s will have corresponding $\pi_k$ values larger than $0.01$. However, as $\alpha$ grows large, the sample becomes a more accurate (though still discrete) approximation of $\mathcal{N}(0,1)$.

In [13]:
import matplotlib.pyplot as plt
from scipy.stats import beta, norm

def dirichlet_sample_approximation(base_measure, alpha, tol=0.01):
    betas = []
    pis = []
    betas.append(beta(1, alpha).rvs())
    pis.append(betas[0])
    while sum(pis) < (1.-tol):
        s = np.sum([np.log(1 - b) for b in betas])
        new_beta = beta(1, alpha).rvs() 
        betas.append(new_beta)
        pis.append(new_beta * np.exp(s))
    pis = np.array(pis)
    thetas = np.array([base_measure() for _ in pis])
    return pis, thetas

def plot_normal_dp_approximation(alpha):
    plt.figure()
    plt.title("Dirichlet Process Sample with N(0,1) Base Measure")
    plt.suptitle("alpha: %s" % alpha)
    pis, thetas = dirichlet_sample_approximation(lambda: norm().rvs(), alpha)
    pis = pis * (norm.pdf(0) / pis.max())
    plt.vlines(thetas, 0, pis, )
    X = np.linspace(-4,4,100)
    plt.plot(X, norm.pdf(X))

plot_normal_dp_approximation(.1)
plot_normal_dp_approximation(1)
plot_normal_dp_approximation(10)
plot_normal_dp_approximation(1000)

Often we want to draw samples from a distribution sampled from a Dirichlet process instead of from the Dirichlet process itself. Much of the literature on the topic unhelpfully refers to this as sampling from a Dirichlet process.

Fortunately, we don't have to draw an infinite number of samples from the base distribution and stick breaking process to do this. Instead, we can draw these samples as they are needed.

Suppose we know a finite number of the $\theta_k$ and $\pi_k$ values for a sample $H\sim \text{DP}(\alpha H_0)$. For example, we know

$$\pi_1=0.5,\; \pi_2=0.3,\; \theta_1=0.1,\; \theta_2=-0.5.$$

To sample from $H$, we can generate a uniform random number $u$ between 0 and 1. If $u$ is less than 0.5, our sample is $0.1$. If $0.5\le u<0.8$, our sample is $-0.5$. If $u\ge 0.8$, our sample (from $H$) will be a new sample $\theta_3$ from $H_0$. At the same time, we should also sample and store $\pi_3$. When we draw our next sample, we will again draw $u\sim\text{Uniform}(0,1)$ but will compare against $\pi_1, \pi_2$, AND $\pi_3$.
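The comparison step can be written out as a small deterministic lookup. This is only a sketch of the bookkeeping for the worked example, using the two known weights 0.5 and 0.3 and the two known locations 0.1 and -0.5. The function name lookup is made up, and returning None is simply this example's way of signalling that a new $\theta$ and a new $\pi$ have to be drawn.

```python
from bisect import bisect_right
from itertools import accumulate

def lookup(u, weights, atoms):
    """Return the known atom selected by u, or None if u lands on unexplored stick."""
    cumulative = list(accumulate(weights))  # [0.5, 0.8] for the example weights
    k = bisect_right(cumulative, u)         # index of the first interval that contains u
    if k < len(atoms):
        return atoms[k]
    return None  # u fell beyond the known weights: a new atom is needed

known_weights = [0.5, 0.3]
known_atoms = [0.1, -0.5]
for u in [0.3, 0.6, 0.9]:
    print("u = %s -> %s" % (u, lookup(u, known_weights, known_atoms)))
```

Once a new atom and its weight are stored, the same lookup runs against three intervals instead of two.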

The class below will take a base distribution $H_0$ and $\alpha$ as arguments to its constructor. The class instance can then be called to generate samples from $H\sim \text{DP}(\alpha H_0)$.

In [20]:
from numpy.random import choice

class DirichletProcessSample():
    def __init__(self, base_measure, alpha):
        self.base_measure = base_measure
        self.alpha = alpha
        
        self.cache = []
        self.weights = []
        self.total_stick_used = 0.

    def __call__(self):
        remaining = 1.0 - self.total_stick_used
        i = DirichletProcessSample.roll_die(self.weights + [remaining])
        if i is not None and i < len(self.weights) :
            return self.cache[i]
        else:
            stick_piece = beta(1, self.alpha).rvs() * remaining
            self.total_stick_used += stick_piece
            self.weights.append(stick_piece)
            new_value = self.base_measure()
            self.cache.append(new_value)
            return new_value
        
    @staticmethod 
    def roll_die(weights):
        if weights:
            return choice(range(len(weights)), p=weights)
        else:
            return None

This Dirichlet process class could be called stochastic memoization. This idea was first articulated in somewhat abstruse terms by Daniel Roy, et al.

Below are histograms of 10000 samples drawn from samples drawn from Dirichlet processes with standard normal base distribution and varying $\alpha$ values.

In [22]:
import pandas as pd

base_measure = lambda: norm().rvs()
n_samples = 10000
samples = {}
for alpha in [1, 10, 100, 1000]:
    dirichlet_norm = DirichletProcessSample(base_measure=base_measure, alpha=alpha)
    samples["Alpha: %s" % alpha] = [dirichlet_norm() for _ in range(n_samples)]

_ = pd.DataFrame(samples).hist()

Note that these histograms look very similar to the corresponding plots of sampled distributions above. However, these histograms are plotting points sampled from a distribution sampled from a Dirichlet process, while the plots above were showing approximate distributions sampled from the Dirichlet process. Of course, as the number of samples from each $H$ grows large, we would expect the histogram to be a very good empirical approximation of $H$.

In a future post, I will look at how this DirichletProcessSample class can be used to draw samples from a hierarchical Dirichlet process.

Tim Hopper: Handy One-off Webpages

I'm starting to love single-page informational websites. For example:

My website Should I Get a Phd? is in this same vein.

Publishing a site like this is very cheap with static hosting on AWS. I would love to see more of them created!

Caktus Group: PyCon 2015 Workshop Video: Building SMS Applications with Django

As proud sponsors of PyCon, we hosted a one and a half hour free workshop. We see the workshops as a wonderful opportunity to share some practical, hands-on experience in our area of expertise: building applications in Django. In addition, it’s a way to give back to the open source community.

This year, Technical Director Mark Lavin and Developers Caleb Smith and David Ray presented “Building SMS Applications with Django.” In the workshop, they taught the basics of SMS application development using Django and Django-based RapidSMS. Aside from covering the basic anatomy of an SMS-based application, as well as building SMS workflows and testing SMS applications, Mark, David, and Caleb were able to bring their practical experience with Caktus client projects to the table.

We’ve used SMS on behalf of international aid organizations and agencies like UNICEF as a cost-effective and pervasive method for conveying urgent information. We’ve built tools to help Libyans register to vote via SMS, deliver critical infant HIV/AIDS results in Zambia and Malawi, and alert humanitarian workers of danger in and around Syria.

Interested in SMS applications and Django? Don’t worry. If you missed the original workshop, we have good news: we recorded it. You can participate by watching the video above!

Tim Hopper: Thinking at Work

Having worked from home for the last few years, I have a hard time understanding how people get anything done in open-floor plan offices. I would be overwhelmed and frustrated by the noise and commotion.

I assumed open-floor plans for software shops were a relatively new invention. However, I just started reading Peopleware: Productive Projects and Teams, first published in 1987, and discovered that the first third of the book rails against open floor plan offices. I particularly enjoyed this quote:

In my years at Bell Labs, we worked in two-person offices. They were spacious, quiet, and the phones could be diverted. I shared my office with Wendl Thomis, who went on to build a small empire as an electric toy maker. In those days, he was working on the Electronic Switching System fault dictionary. The dictionary scheme relied upon the notion of n-space proximity, a concept that was hairy enough to challenge even Wendl's powers of concentration. One afternoon, I was bent over a program listing while Wendl was staring into space, his feet propped up on his desk. Our boss came in and asked, "Wendl! What are you doing?" Wendl said, "I'm thinking." And the boss said, "Can't you do that at home?"

The difference between that Bell Labs environment and a typical modern-day office plan is that in those quiet offices, one at least had the option of thinking on the job. In most of the office space we encounter today, there is enough noise and interruption to make any serious thinking virtually impossible. More is the shame: Your people bring their brains with them every morning. They could put them to work for you at no additional cost if only there were a small measure of peace and quiet in the workplace.

Tim Hopper: Tweets I'm Proud Of

Tim Hopper: New Post Function for Bash

One of the things I don't like about using a static site generator is the friction required for creating a new post. I often end up posting things to Twitter that I would prefer to be more permanent simply because of the ease of tweeting.

To that end, I created a quick Bash function to create a new post for me. Creating this post in my Pelican directory only requires typing

$ new-post "New Post Function for Bash"

Combined with Greg Reda's Travis CI trick, the friction in getting a new post online is greatly reduced.

Caktus Group: Q3 2015 ShipIt Day ReCap

Last Friday marked another ShipIt Day at Caktus, a chance for our employees to set aside client work for experimentation and personal development. It’s always a wonderful chance for our developers to test new boundaries, learn new skills and sometimes even build something entirely new in a single day.


NC Nwoko and Mark Lavin teamed up to develop a pizza calculator app. The app simply and efficiently calculates how much pizza any host or catering planner needs to order to feed a large group of people. We eat a lot of pizza at Caktus. Noticing deficiencies in other calculators on the internet, NC and Mark built something simple, clean, and (above all) well researched. In the future, they hope to add size mixing capabilities as well as a function for calculating the necessary ratios to provide for certain dietary restrictions, such as vegan, vegetarian, or gluten-free eaters.

Jeff Bradberry and Scott Morningstar worked on getting Ansible functioning to replace SALT states in the Django project template and made a lot of progress. And Karen Tracey approached some recent test failures, importing solutions from the database, while Rebecca Muraya began the massive task of updating some of our client based projects to Python 3.

Hunter MacDermut continued building the game he started last ShipIt Day, an HTML5 game using the Phaser framework. He added logic and other game-like elements to make a travelable board with the goal of destroying opponents. He also added animated sprites, including animations for an attack, giving each character their own unique moves. The result was a lot of fun to watch!

Dmitriy Chukhin and Caleb Smith developed a YouTube listening queue using ReactJS, with jQuery for the data layer. They loved the tag functions inherent in ReactJS as well as the speed.

Victor Rocha wrote a new admin action that enables a user to export models as a CSV file. He even found time to open source his work.

Vinod Kurup spent his day fixing RapidSMS bugs, creating two new pull requests. You can find them here and here. Once reviewed, they will be incorporated in the next RapidSMS release.

Neil Ashton worked through three chapters of experiments from The Foundations of Statistics: A Simulation-based Approach using IPython Notebook. He subsequently fell in love with IPython Notebook. An interactive computational environment, the IPython Notebook seems the perfect platform for Neil’s love of data visualization and interactive experimentation. It allows the user to combine code execution, rich text, mathematics, plots, and other rich media.

Ross Pike spent the day exploring Font Awesome, the open-source library for scalable vector icons. He also took several tutorials in Sketch, an application for designing websites, interfaces, icons, and pretty much anything else.

Tobias McNulty spent some time working on the next release of django-cache-machine, a third-party Django app that adds caching and automatic invalidation to your Django models on a per-model basis. This ShipIt Day he worked on adding Python 3 support (with help from Vinod) and added a feature to support invalidation of queries when new model instances are created.

Finally, inspired by the open data apps built by Code for Durham, Rebecca Conley used D3 to build visualizations of data on North Carolina’s public schools. Eventually, she wants to test more complex bar graph visualizations as well as learn data visualization in D3 beyond the bar graph.

We had a number of people on vacation this ShipIt Day and several administrators and team members who couldn’t put away their typical workload this time around. But no matter; there is always the next ShipIt Day!

Astro Code SchoolVideo - Why go to code school?

In this video Astro Lead Instructor Caleb Smith answers the question, "Why go to code school?". A major point is the laser focus. Code schools allow you to learn precisely the skills needed to perform a highly technical job. Watch this video for other parts of his answer. This is the first in a series of question and answer videos. More answers to real questions you may have about going to code school are on the way.

Don't forget to subscribe to the Astro Code School YouTube channel.

Astro Code SchoolDjango Bootcamp in Baltimore at Betamore

On August 7, 2015 Caleb Smith will be teaching a Django Bootcamp in Baltimore, Maryland! This short class is from 9am to 3pm and will be held at Betamore, a coworking space, incubator and campus for technology and entrepreneurship. At the class students will learn how to build a simple Django app. This bootcamp is targeted at beginners but everyone is welcome. The prerequisites are a laptop with Python and pip installed. Sign up now and reserve your spot.

Big thanks to the super kind folks at Betamore for hosting this and working with us to get this class going.

Caktus GroupLAN Party at Caktus

This past weekend, our wonderful Technology Support Specialist Scott Morningstar hosted a Local Area Network (LAN) party at Caktus HQ. Held twice a year since 2008, the event allows geeks, gamers, and retro technology lovers to relive the nostalgia of multiplayer gaming in the early days of dial-up internet. In other words, everyone brings their own computer, and uses the LAN to play online games in the company of others. These parties are a lot of fun and add a more personal social element to the online gaming community.

This year, participants played Terraria, an action game with creative world-building elements, Artemis, a spaceship bridge simulator game, and Counter Strike: Global Offensive, a team-based modern warfare game. Not only was it wonderful to see our space filled with enthusiastic gamers, but it was doubly exciting that participants joined gameplay remotely from LAN party events in Boston, Massachusetts, Farmville, Virginia, and Minneapolis, Minnesota.

We love being able to support and host such a wide variety of technology-related events in our community meeting space! For information on other functions held in our downtown Durham headquarters, or in our Astro Code School space, be sure to check out the Events page on our website.

Og MacielBooks - June 2015

Those of you who know me know that I am a huge book reader and spend most of my free time reading several books at the same time. One could say that reading is one of my passions, and having wasted so many years after high school completely ignoring this passion (in exchange for spending most of my time trying to learn about Linux, get an education, a job and, let's be frank, chasing after girls), I decided that something had to be done about it, and starting around 2008 I 'forced' myself to dedicate at least one solid hour of reading for fun every day.

I find it funny to say that I had to force myself, but this statement is very much true. Being so used to spending all of my time sitting in front of a computer and getting flooded with information every single minute of the day (IRC, Twitter, Facebook, commit emails, RSS feeds, etc), I found it difficult to 'unplug' and spend time doing nothing but focusing on only one thing. I was so used to multitasking and being constantly bombarded with lots of information that sitting quietly and reading didn't feel very productive to me... sad but true.

Anyhow, after several 'agonizing' months of getting up from my desk and making a point of turning off my cell phone and finding a quiet place somewhere in the building (or at home during the weekends), I finally got into the habit of reading for pleasure. I actually looked forward to these reading periods (imagine that, huh?) and eventually I realized that if I skipped this 'ritual' even one day, my days felt like they got longer and I felt stressed out and irritable for the remainder of the day. Reading became not only a good habit but my mechanism for relaxing and recharging my energies during the day!

Well, this passion and appetite for reading has only gotten bigger, and with time I have to say that it has become a pretty big part of who I am today! In a way I am happy that it took me this long to get back into the habit of reading... I mean, I feel that getting older was an important part of preparing myself so that I could really appreciate John Steinbeck, Ray Bradbury and the likes of them! Would I have truly appreciated The Grapes of Wrath when I was younger? Perhaps... but it took me around 40 years to get to it and I'm happy that when I did I was able to appreciate this amazing piece of art!

These last few months I decided that I wanted to start tracking all the books that I read, buy or receive as a gift every month (see my reading progress on GoodReads and add me as a friend), and jot down some of my impressions and motives for reading or buying them. Those familiar with Nick Hornby will probably associate this post (and hopefully others that will surely come) with the work he has done writing for the Believer Magazine ... and this would be correct. My intention is not to copy his style or anything like that, but I thought that the format he chose to report on his own reading 'adventures' would fit in quite nicely with what I wanted to get across to my readers... and I'm sticking with the format as long as it works for me :)

Astro Code SchoolVideo - Interview with Lead Instructor Caleb Smith

In this Astro interview video I talk with our Lead Instructor Caleb Smith. We learn about Caleb's formal education, a connection between music and computer programming, and why teaching excites him. Caleb wrote the curriculum and teaches the Astro Code School Python & Django Web Development class.

Don't forget to subscribe to the Astro Code School YouTube channel. We have a lot more videos in the works.

Caktus GroupLightning Talk Lunch: Two Useful Organizational Tools

Monthly, we organize short Lightning Talks that take place during the lunch hour here at Caktus. Not only does this allow us a wonderful excuse to have lunch delivered from one of our many local foodie options, but it’s an excellent chance to expand our knowledge on a variety of topics. Past talks have included everything from an introduction to synthesizers and other forms of electronic music, to bug fixing, to the design inspiration behind our PyCon 2015 site.

This month, we had two talks on organizational tools for project management and resource sorting. Developer Dan Poirier gave a brief talk on Pinboard, or, as he fondly refers to it, “social bookmarking for introverts.” Essentially, Pinboard is a database for storing, organizing, and sharing links and bookmarks to articles and pages on the web. Though lacking in sharp design or beautiful layout, Pinboard is useful, highly functional, and extremely intuitive. Dan was a wonderful guide in walking us through how he uses Pinboard to store development tips and articles, as well as information related to his various projects for Caktus. He even built his own front-end for the site to help organize his finds for daily use and to share with other Caktus developers.

Our second talk came from Game Designers Edward and Lucas Rowe, who are currently finishing up the work on our Epic Allies app. Before this project, Caktus wanted to try out a new management tool for development; Epic Allies turned out to be a good fit for testing JIRA, the issue and project tracking software from Atlassian. In their talk, Lucas and Edward took us on a tour of JIRA, discussed its functionality for development projects, and showed us how Epic Allies specifically used this highly customizable platform.

All in all it was an informative day, and Dan, Edward, and Lucas may have all won a few converts to their favorite organizational tools. Now we can’t wait to see what’s in the pipeline for our next set of Lightning Talks!

Astro Code SchoolAnnouncing Caktus Scholarships for Astro Code School

We’re very pleased to announce that Caktus Group will be sponsoring up to $20,000 worth of scholarships annually for Astro Code School students. There will be twenty $1,000 scholarships. We hope that these scholarships help increase access to code schools and the wider tech industry:

Caktus Group Diversity & Veterans Scholarship

This scholarship aims to support the careers of underrepresented groups in technology, specifically women, people of color, military veterans, and people with disabilities. For classrooms and the tech industry to be the best they can be, they need ideas from diverse groups of people.

Caktus Group North Carolinians Scholarship

Anyone who lives in North Carolina is eligible to receive this scholarship. Caktus was founded in North Carolina and we’ve benefited from the great talent here. We want tech growth in our area to include those that live here.

You can find more information about our scholarships on the financial aid page.

Astro Code SchoolWhat I Learned Teaching at UNC

This spring semester, I had the honor of teaching JOMC-583 "Multimedia Programming and Production" for the University of North Carolina at Chapel Hill School of Journalism and Mass Communication. The course requires university permission and two prior multimedia programming courses that focus on frontend web development. It was a wonderful opportunity to partner with the university, especially with a department that has shown leadership in recent years with adopting innovative programs and coursework for students interested in the data-driven area of journalism.

The subject matter of the course centered around backend web development with Python and Django and also included other technologies such as git, SQL, and the Unix command line. As a rough outline, the lecture topics were:

  1. Unix command line

  2. Git and Github

  3. Python

  4. Introductory Django

  5. Django views and templates

  6. Django models and data modeling

  7. Frontend development inside a Django project

  8. Miscellaneous topics

  9. Group project time

The course materials were based on Steven King's curriculum for the course from the year prior and are available at https://github.com/calebsmith/j583

At a high level, the first half of the course was a mixture of lecture and individual assignments while the second half of the course was spent on two projects. The first development project was completed individually and was small in scale. The second and final project was more ambitious and required collaboration using Github. This served as a nice progression from focusing on concrete skills in isolation to applying those skills and developing further experientially.

One of the group projects was deployed successfully to Heroku and is visible here: http://rackfind.herokuapp.com/

While I think the course was a major learning experience for the students, it certainly was for me as well. It was particularly interesting to see the subject areas that students picked up easily or struggled with and how this often differed from my expectations. In particular, some areas that students picked up quickly were:

  1. The essential Unix command line tools such as: pwd, ls, cd, and so on

  2. Python basics

  3. Python packaging and setup, especially pip and virtualenv

  4. Using Git as a sole contributor

  5. Creating a data model

The students were much quicker to learn these concepts than I anticipated. For instance, we spent two lecture periods focusing on developing skills for the command line, but the first class was enough for most tasks. In the future, I would likely plan on needing only one lecture for that topic.

Some topics that required more reinforcement than anticipated were:

  1. Why writing a custom backend is desirable as opposed to a static HTML site

  2. The semantics of Django URL routing

  3. How to glue JavaScript code into Django templates

I think the fundamental reason that students struggled with these topics more than anticipated relates to their arrival in the domain of backend programming from a background of frontend web development.

This was a great experience for me and it was rewarding to see my students succeed in programming with Python and Django. I'm very much looking forward to more opportunities to teach web development in the future.

Astro Code SchoolPython Beginner’s Night at Astro

Last night we held the first TriPython Python Beginner’s Night. About twenty-three people interested in Python attended. Many of them were very experienced developers who answered all kinds of questions, from the very basic to the advanced.

A big thanks to all the Caktus Group folks who attended. You helped a lot of people! Thanks also to the other volunteers who attended. It's really cool to live in a city with so many people who enjoy helping others.

The next free Python Beginner's Night is Monday July 6, 2015 from 6pm to 8pm here at Astro Code School (map). We'll be here on the first Monday of each month with free pizza and Python experts. If you can join us please RSVP on the Meetup page. See you soon!

Caktus GroupEpic Allies Featured at mHealth at Duke 2015 Conference

At this year’s mHealth at Duke 2015 Conference, Dr. Lisa Hightow-Weidman discussed her current mHealth projects for HIV prevention. Chief among these projects is her work with Caktus Group on Epic Allies, a mobile gaming app that utilizes social media and mini-games to increase adherence to prescribed medication amongst HIV-positive men who have sex with men (MSM).

Why this particular population? According to research, MSM account for two-thirds of all new HIV infections. In fact, they are the only risk group experiencing an increase in incidence, especially in the southern United States. With 83% of young adults using smartphones, a mobile solution is ideal for targeting at-risk youth in this particular population.

Enter Epic Allies, an adherence intervention that seeks to make taking medication fun while providing social, community support. The app combines gaming, anonymous social interactions, medication reminders, and healthy habit rewards systems to encourage adherence to treatment. The app was developed under a Small Business Innovation Research grant from the National Institutes of Health and was built by Caktus Group in partnership with the UNC Institute for Global Health and Infectious Diseases and the Duke Global Health Institute.


Astro Code SchoolLearn About Astro Code School Info Session

Join us online at 10am EDT on Thursday, June 25, 2015 for a Google Hangout information session. Caleb and I will host the hangout and talk a little bit about Astro, then answer any questions you might have. Please share this post and RSVP on the Hangout page.

Caktus GroupStanford Social Innovation Review Highlights Caktus' Work in Libya

The Stanford Social Innovation Review recently featured Caktus in “Text the Vote” in Suzie Boss’ “What’s Next: New Approaches to Social Change” column. It describes how our team of developers built the world’s first SMS voter registration system in Libya using RapidSMS.

Article excerpt

In a classic leapfrogging initiative, Libya has enabled its citizens to complete voter registration via digital messaging technology.

In late 2013, soon after Vinod Kurup joined Caktus Group, an open source software firm based in Durham, N.C., he became the lead developer for a new app. The client was the government of Libya, and the purpose of the app would be to support voter registration for the 2014 national elections in that country. Bomb threats and protests in Libya made in-person registration risky. “I realized right away that this wasn’t your standard tech project,” says Kurup.

As a result of that project, Libya became the first country in the world where citizens can register to vote via SMS text messaging. By the end of 2014, 1.5 million people—nearly half of all eligible voters in Libya— had taken advantage of the Caktus-designed app during two national elections. “This never would have happened in a country like the United States, where we have established systems in place [for registering voters],” says Tobias McNulty, co-founder and CEO of Caktus. “Libya was perfect for it. They didn’t have an infrastructure. They were looking for something that could be built and deployed fast.”

To read the rest of the article, visit the Stanford Social Innovation Review online.

Caktus GroupRobots Robots Ra Ra Ra!!! (PyCon 2015 Must-See Talk: 6/6)

Part six of six in our PyCon 2015 Must-See Series, a weekly highlight of talks our staff enjoyed at PyCon.

I've had an interest in robotics since high school, but always thought it would be expensive and time consuming to actually do. Over the past few years, though, I've observed the rise of open hardware such as the Arduino and the Raspberry Pi, and modules and kits built on top of them, that make this type of project more affordable and accessible to the casual hobbyist. I was excited by Katherine's talk because Robot Operating System (ROS) seems to do for the software side what Arduino and such do for the hardware side.

ROS is a framework that can be used to control a wide range of robots and hardware. It abstracts away the hard work, allowing for a publish-subscribe method of communicating with your robot's subsystems. A plus side is that you can use higher-level programming languages such as Python or Lisp, not just C and C++, and there is an active and vibrant open source community built up around it already. Katherine did multiple demonstrations with a robot arm that she'd brought to the talk, that did much with a relatively small amount of easily understandable code. She showed that it was even easy to hook in OpenCV and do such things as finding a red bottle cap in the robot's field of vision.
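
To make the publish-subscribe idea concrete, here is a toy in-process sketch in plain Python. It is not the ROS API: the `MessageBus` class and the topic names are made up for illustration, and real ROS nodes communicate across processes and machines. It only shows the core pattern, where a publisher sends to a named topic without knowing which subsystems are listening.

```python
from collections import defaultdict


class MessageBus:
    """Toy message bus: each topic maps to a list of subscriber callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to be invoked for every message on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver a message to every subscriber of the topic, in order."""
        for callback in self._subscribers[topic]:
            callback(message)


bus = MessageBus()
arm_log = []

# A hypothetical "arm" subsystem listens for joint-angle commands.
bus.subscribe("arm/joint_angles", arm_log.append)

# A "planner" publishes without knowing who, if anyone, is listening.
bus.publish("arm/joint_angles", [0.0, 45.0, 90.0])
bus.publish("camera/frames", "dropped: nobody subscribed")

print(arm_log)
```

Only the message on the subscribed topic reaches the arm's log; the camera message is silently dropped because nothing subscribed to that topic.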


More in the PyCon 2015 Must-See Talks Series.

Caktus GroupTesting Client-Side Applications with Django Post Mortem

I had the opportunity to give a webcast for O’Reilly Media during which I encountered a presenter’s nightmare: a broken demo. Worse than that, it was a test failure in a presentation about testing. Is there any way to salvage such an epic failure?

What Happened

It was my second webcast and I chose to use the same format for both. I started with some brief introductory slides but most of the time was spent as a screen share, going through the code as well as running some commands in the terminal. Since this webcast was about testing, this was mostly writing more tests and then running them. I had git branches set up for each phase of the process and for the first forty minutes this was going along great. Then it came to the grand finale. Integrate the server and client tests all together and run one last time. And it failed.

Test Failure

I quickly abandoned the idea of attempting to live debug this error and since I was at the end anyway I just went into my wrap-up. Completely humbled and embarrassed, I tried to answer the questions from the audience as gracefully as I could while inside I wanted to just curl up and hide.

Tracing the Error

The webcast was the end of the working day for me so when I was done I packed up and headed home. I had dinner with my family and tried not to obsess about what had just happened. The next morning with a clearer head I decided to dig into the problem. I had done much of the setup on my personal laptop but ran the webcast on my work laptop. Maybe there was something different about the machine setups. I ran the test again on my personal laptop. Still failed. I was sure I had tested this. Was I losing my mind?

I looked through my terminal history. There it was and I ran it again.

Single Test Passing

It passed! I’m not crazy! But what does that mean? I had run the test in isolation and it passed but when run in the full suite it failed. This points to some global shared state between tests. I took another look at the test.

import os

from django.conf import settings
from django.contrib.staticfiles.testing import StaticLiveServerTestCase
from django.test.utils import override_settings

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.ui import WebDriverWait


@override_settings(STATICFILES_DIRS=(
    os.path.join(os.path.dirname(__file__), 'static'), ))
class QunitTests(StaticLiveServerTestCase):
    """Interactive tests with selenium."""

    @classmethod
    def setUpClass(cls):
        cls.browser = webdriver.PhantomJS()
        super().setUpClass()

    @classmethod
    def tearDownClass(cls):
        cls.browser.quit()
        super().tearDownClass()

    def test_qunit(self):
        """Load the QUnit tests and check for failures."""

        self.browser.get(self.live_server_url + settings.STATIC_URL + 'index.html')
        results = WebDriverWait(self.browser, 5).until(
            expected_conditions.visibility_of_element_located(
                (By.ID, 'qunit-testresult')))
        total = int(results.find_element_by_class_name('total').text)
        failed = int(results.find_element_by_class_name('failed').text)
        self.assertTrue(total and not failed, results.text)

It seemed pretty isolated to me. The test gets its own webdriver instance. There is no file system manipulation. There is no interaction with the database, and even if there were, Django runs each test in its own transaction and rolls it back. Maybe this shared state wasn’t in my code.

Finding a Fix

I’ll admit when people on IRC or Stack Overflow claim to have found a bug in Django my first instinct is to laugh. However, Django does have some shared state in its settings configuration. The test is using the override_settings decorator but perhaps there was something preventing it from working. I started to dig into the staticfiles code and that’s where I found it. Django was using the lru_cache decorator for the construction of the staticfiles finders. This means they were being cached after their first access. Since this test was running last in the suite, the finders had already been built and cached by earlier tests, so the change to STATICFILES_DIRS was not taking effect. Fixing my test simply meant busting this cache at the start of the test.

...
from django.contrib.staticfiles import finders, storage
...
from django.utils.functional import empty
...
class QunitTests(StaticLiveServerTestCase):
...
    def setUp(self):
        # Clear the cached versions of the staticfiles finders and storage
        # See https://code.djangoproject.com/ticket/24197
        storage.staticfiles_storage._wrapped = empty
        finders.get_finder.cache_clear()

All Tests Passing
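
The caching behavior behind this failure can be reproduced without Django. The sketch below is a simplified stand-in, not Django's code: the `SETTINGS` dictionary and `get_static_dirs` function are hypothetical names. It shows how a function wrapped in `functools.lru_cache` keeps returning its first result after the underlying setting changes, until `cache_clear()` is called.

```python
from functools import lru_cache

# Stand-in for a settings object; the names and paths are hypothetical.
SETTINGS = {"STATICFILES_DIRS": ("/srv/app/static",)}


@lru_cache(maxsize=None)
def get_static_dirs():
    """Read the setting once; later calls return the cached tuple."""
    return SETTINGS["STATICFILES_DIRS"]


# An earlier "test" touches the function and populates the cache.
first = get_static_dirs()

# A later "test" overrides the setting, as override_settings would.
SETTINGS["STATICFILES_DIRS"] = ("/tmp/test/static",)
stale = get_static_dirs()   # still the original value from the cache

# Busting the cache makes the override visible.
get_static_dirs.cache_clear()
fresh = get_static_dirs()

print(stale)
print(fresh)
```

Run in isolation, the "later test" would be the first caller and would see its own override, which matches the pattern of a test that passes alone and fails in the full suite.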

Fixing at the Source

Digging into this problem, it became clear that this wasn’t just a problem with the STATICFILES_DIRS setting but was a problem with using override_settings with most of the contrib.staticfiles related settings. In fact I found the easiest fix for my test case by looking at Django’s own test suite. I decided this really needed to be fixed in Django so that this issue wouldn’t bite any other developers. I opened a ticket and a few days later I created a pull request with the fix. After some helpful review from Tim Graham it was merged and was included in the recent 1.8 release.

What’s Next

Having a test which passes alone and fails when running in the suite is a very frustrating problem. It wasn’t something that I planned to demonstrate when I started with this webcast but that’s where I ended up. The problem I experienced was entirely preventable if I had prepared for the webcast better. However, my own failing led to a great example of tracking down global state in a test suite and ultimately helped to improve my favorite web framework in just the slightest amount. Altogether I think it makes the webcast better than I could have planned it.

Caktus GroupTech Community Yoga Now Offered at Caktus

The Caktus office is now home to a weekly yoga class for the tech community of Durham. Via our employee suggestion box, Lead Designer Ross Pike recommended a Caktus yoga class. Through team effort that suggestion will come to fruition next week. Starting Thursday, June 11th, we will be offering a yoga class taught by professional instructor Christina Conley. The class will be open to the public at large and will be held in our community meeting space at our offices in downtown Durham.

If you are interested in joining the yoga class, you can sign up here ($8 per session): http://www.eventbrite.com/e/tech-community-yoga-class-tickets-17261719267

Also, be on the lookout for a Caktus run club in the next few weeks. Here’s to more great ideas from the suggestion box!

Caktus GroupPyLadies RDU and Astro Code School Team Up for an Intro to Django Workshop

This past Saturday, Caktus developer Rebecca Conley taught a 4-hour introductory level workshop in Django hosted by PyLadies RDU. PyLadies RDU is the local chapter of an international mentorship group for women who love coding in Python. Their main focus is to empower women to become more active participants and leaders in the Python open-source community.

The workshop was held in our Astro Code School space and sponsored by Windsor Circle, Astro Code School, and Caktus Group. Leslie Ray, the local organizer of PyLadies, is always looking for new opportunities “to create a supportive atmosphere for women to learn and teach Python.” With a strong interest in building projects in Django herself, Leslie thought an introductory workshop was the perfect offering for those looking to expand their knowledge in Python as well as a great platform from which Rebecca could solidify her own skills in the language.

“Django is practical,” explains Rebecca, “and it’s the logical next step for those with experience in Python looking to expand their toolkit.”

The event was extremely successful, with a total of thirty students in attendance. Rebecca was impressed with the students, who were “enthusiastic and willing to work cooperatively,” which is always key in workshop environments. The class attracted everyone from undergraduates, to PhD students, to those looking into mid-career changes. In addition, she was glad to team up with PyLadies for the workshop, appreciating the group’s goal to provide a free and friendly environment for those wishing to improve and expand on their skills.

“It’s important to create new channels for individuals to explore programming. Unfortunately, the lack of diversity in tech is an indication not of who is interested in programming or technology, but of the lack of entryways into that industry. So any opportunity to widen that gateway, or to create more gateways, or to give more people the power to program is to be valued and diversity will ultimately make the field better.”

“It’s important to create new gateways for people to enter the field. The group of people with interest in and aptitude for programming is large and diverse, and diversity will make this field better. It’s up to those of us already in the field to open more doors and actively welcome and support people when they come in.”

For more information on PyLadies and their local programming, be sure to join their Meetup page, follow them on Twitter, or check out the international PyLadies group page. Other local groups that provide opportunities to code and that we’re proud sponsors of include Girl Develop It! RDU, TriPython, and Code for Durham. For women in tech seeking career support, Caktus also founded Durham Women in Tech.

Astro Code SchoolVideo - Conditionals in Python

In Caleb Smith's third video in our series about beginning Python he shows you comparison operators, input(), print(), indentation and if statements in Python. Use http://repl.it/languages/Python3 to follow along in the browser.

Don't forget to subscribe to the Astro Code School YouTube channel. We have a lot more videos in the works.

Astro Code SchoolVideo - Using repl.it with Python 3

This is Caleb Smith's second video in our series about beginning Python. It shows you how to use the web based Python shell and text editor repl.it. Use http://repl.it/languages/Python3 to follow along in the browser.

Don't forget to subscribe to the Astro Code School YouTube channel. We have a lot more videos in the works.

Astro Code SchoolVideo - Very First Steps with Python

This is Caleb Smith's first video in our series about beginning Python. It introduces some fundamentals of programming in Python. Topics for this video include data values, types, basic operators and variables.

Don't forget to subscribe to the Astro Code School YouTube channel. We have a lot more videos in the works.

Caktus GroupCreating and Using Open Source: A Guide for ICT4D Managers

Choosing an open source product or platform upon which to build an ICT4D service is hard. Creating a sustainable, volunteer-driven open source project is even harder. There is a proliferation of open source tools in the world, but the messaging used to describe a given project does not always line up with the underlying technology. For example, the project may make claims about modularity or pluggability that, upon further investigation, prove to be exaggerations at best. Similarly, managers of ICT4D projects may be attracted to Open Source because of the promise of a “free” product, but as we’ve learned through trial and error at Caktus, it’s not always less costly to adapt an existing open source project than it would be to engineer a quality system from the ground up.

In this post I will go over some of the criteria we look at when evaluating a new open source project, from a developer’s perspective, in the hopes that it helps managers of ICT4D projects make educated decisions about when it makes sense to adopt a pre-existing open source solution. For those ICT4D managers looking to release a new open source platform, what follows may also prove helpful when deciding how best to allocate resources to the initial release and ongoing management of an open source product or platform. To that end, I’ll provide a high level overview of what matters most: licensing, code quality assessments, automated testing, development workflow, documentation, release frequency, and community engagement.

The three things that are most important to ICT4D projects, I would argue, are quick iteration, replicability, and scalability. Quick iteration is required in order to get early drafts of solutions out in front of beneficiaries to pilot as quickly as possible. Replicability is important when a pilot project is ready to be tested in multiple locations. Similarly, once a pilot has been shown to be successful, the ability to quickly scale up that project to meet regional, national, or even international demand is critical.

The problem is that these three success factors often place competing demands on the project. Doing things the quick and dirty way may be perceived as shortening the time to a working solution, but it also means the solution might not work in other contexts. Similarly, the project might hit a technical barrier when it comes time to scale up. With proper planning and execution, however, I believe all three of these (quick iteration, replicability, and scalability) can be achieved in a way that requires neither compromises nor starting over from scratch when it comes time to replicate or scale an ICT4D project. Furthermore, we believe strongly at Caktus that doing things the right way the first time minimizes both risk and the time to develop a software project, even for quick, iterative pilots.

Selecting permissive licenses lowers the barrier to entry

There are many types and subtypes of open source licensing, and trying to select a project based on a license can easily get confusing. Generally speaking, we opt for the more permissive BSD- or MIT-style licenses at Caktus when we have the choice. The main thing to consider when using software with more restrictive licenses such as the GPL or AGPL is that they tend to be less business- or donor-friendly and hence may attract a smaller overall community than they would have otherwise. They can also add requirements that your project might not otherwise have had, such as open-sourcing it.

Creating code readable by humans improves scalability

Code quality is something that is easy to forget about early in a project. ICT4D pilots are often like startups: the drive is to get features out the door as quickly as possible to test and prove the minimum viable product (MVP). We believe you can produce work that is both speedily deployed and later easy to scale by focusing on code quality from the start. In software development there is a concept of “technical debt:” Moving quickly without concern for quality creates “debt” that must be paid back, with interest accruing over time.

Code quality includes creating code that is readable to fellow developers. Like any language, clarity for other people reading it matters. At Caktus our preference generally tends to be for the Python programming language because it is well known for being highly readable and easy to learn.

For those ICT4D program managers starting new projects, regardless of the programming language, it’s helpful to build in time for the development team to add automated checks to the project that enforce a code formatting standard. For those evaluating a new open source solution, apart from reviewing the code itself, ICT4D program managers can check for the existence of documented coding standards. The end goal is for all developers on a project to write code that is indistinguishable from another developer’s code; you should not be able to tell from looking at a piece of code who wrote it. This makes it easier both to bring new people into the project and for a developer to jump into a part of the code he or she didn’t write, in case the person who wrote it happens to be inaccessible at the time an urgent change is needed. The code should be the product of the team, not a set of disparate individuals, and having code formatting standards in place helps encourage that. At Caktus, we typically use flake8 (run via Travis CI) to check the format of our code automatically each time a developer makes a commit or submits a pull request.

Automated code testing ensures reliability

Automated code testing is both best practice and necessary to avoid software failures, but we have seen it dismissed in the rush to deploy. The key question for ICT4D program managers to consider in the planning process is what kind of automated testing developers are using. Automated testing includes both “unit” and “integration” testing. “Unit tests” are pieces of code that individually test discrete parts of the overall code base to ensure they continue to work as expected as changes are made to the system. “Integration tests,” similarly, verify that the different components function when combined into a complete system. The end goal of both types of tests is the same: to ensure that the existing software does not break as features are added or changed or bugs are fixed. Absent automated tests, it’s all too easy for something as small as a bug fix to introduce one or more new, unanticipated bugs in other parts of the system.

At Caktus we primarily use Django’s testing framework, which is based on Python’s unittest framework. We also set up Continuous Integration to run tests on every set of changes automatically and email the developers when tests fail, so the team is always aware when the tests aren’t passing. When evaluating whether or not a project relies heavily on automated testing, two things to look for are (a) whether or not the project advertises test coverage (as a percentage, at least 85-90% is preferred), and (b) whether or not the development process requires new features to come bundled with unit tests. As with code quality, if automated tests are left out of a project, I would argue that the time to develop the project will actually increase rather than decrease because the development team will end up spending time tracking down bugs that would have been caught by the testing framework, time that could have been spent developing features.
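For readers who have not seen automated tests before, the following is a minimal sketch of the difference between the two kinds of test, written with Python’s standard unittest module. The functions `parse_amount` and `total_weight` are hypothetical examples invented for this illustration; they are not taken from any Caktus project or from Django.

```python
import unittest


def parse_amount(text):
    """Convert a string such as '3 kg' into a (quantity, unit) pair."""
    quantity, unit = text.split()
    return float(quantity), unit


def total_weight(lines):
    """Sum the quantities of several '<number> kg' strings."""
    return sum(parse_amount(line)[0] for line in lines)


class ParseAmountTests(unittest.TestCase):
    """Unit tests: exercise one small function in isolation."""

    def test_parses_quantity_and_unit(self):
        self.assertEqual(parse_amount("3 kg"), (3.0, "kg"))

    def test_rejects_malformed_input(self):
        # A string without a unit cannot be split into two parts.
        with self.assertRaises(ValueError):
            parse_amount("three")


class TotalWeightTests(unittest.TestCase):
    """Integration-style test: check that the pieces work together."""

    def test_sums_multiple_lines(self):
        self.assertEqual(total_weight(["3 kg", "1.5 kg"]), 4.5)
```

Saved in a file (say, a hypothetical `test_amounts.py`), these tests can be run with `python -m unittest test_amounts`. A Continuous Integration service runs the same kind of command on every change and reports any failure.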

A documented development workflow streamlines new contributions

The development workflow is another important part of any software project, in particular open source projects. Open source projects should have a clearly documented, community supported method for (a) proposing and discussing potential features or other changes, (b) developing those changes, (c) having those changes reviewed and approved by other developers, (d) merging those changes into the main branch(es), and (e) releasing sets of those changes as numbered releases (e.g., v1.2). Whether a project has these things documented can usually be discovered easily by searching for a “developer manual” or “contributors guide,” as well as reviewing the content of the project’s developer mailing list to see evidence of how contributions work in practice. This documentation acts as a clear entry point for both users and developers without which open source projects wither.

At Caktus we typically use a variant of the GitHub Flow model that includes one additional “staging” or “develop” branch that is used to deploy the code to an intermediary “staging” server. This allows code to be tested before being deployed to the production server. A key part of this workflow is the peer code review, a process by which a fellow developer reviews every new change. Not only does the process help detect potential issues early, it also broadens overall knowledge of the code base. Code reviews can’t be done intermittently or when it’s convenient, but should be done for every change being made to the project. We believe creating a culture of code reviews allows individual developers to forgo ego in favor of a drive towards system integrity. One can evaluate whether a project does code reviews by checking a number of places, including the project developer mailing list, the GitHub or BitBucket “pull requests” feature which allows line-by-line reviews, or simply by reviewing the commit log to see if changes are made directly to the “master” or “default” branch or if they’re made to separate “feature” branches first.

Clear documentation helps create sustainable open source projects

Good documentation is fundamental to any successful open source project. Perhaps counterintuitively, it’s just as easy to have too much documentation as it is to have too little. Signs that an open source project takes documentation seriously include things like how often the documentation is referenced on the project’s mailing list(s), where the documentation is stored, how the documentation is edited, and how easy the documentation makes it, both for new users and developers of the project, to come on board. While not always the case, documentation that is automatically generated by the code can be a case of “too much” rather than “good” documentation. Jacob Kaplan-Moss of the Django project wrote a great blog post back in 2009 on writing good technical documentation that is worth a read for anyone putting together documentation for an open source project.

At Caktus we generally have a preference for storing developer-written documentation in the code repository itself; this allows the team to quickly update documentation when code changes are made, and also makes it easy to spot discrepancies between code changes and documentation changes when doing code reviews. While wikis may be easy to update, they tend to fall out of sync with the code because updating them happens as part of a different process. Hosting documentation in a wiki also makes it harder to refer back to older versions of the documentation if you have a system that’s been running for a few years and have not been able to upgrade the underlying platform.

Regular releases and recent “commits” help ensure continuity

One of the first things we tend to check (in part because it’s one of the easiest) is how recently the project we’re evaluating released a new version and/or how recently someone committed new changes to the code. While it’s not always a bad sign if there hasn’t been a release in a year or two, it’s generally better to find projects that release regularly, at least 2-3 times a year. It can also be a bad sign, for example, if there are lots of frequent commits to the code repository, but the last “released” (numbered) version is many months or years old. This may mean that the release management has fallen off track, and the project is targeting only internal users rather than the larger open source / ICT4D community.

Developer community engagement necessary to leverage the power of open source

Community engagement and openness are two more important factors to consider when selecting an open source project as the foundation for (or to add to) an ICT4D solution. Community engagement matters because projects without a community of users and contributors tend not to be maintained over the long run. Engagement of the community can be evaluated by reviewing traffic on the project’s mailing list(s) and bug tracker (for both users and developers) and determining the prevailing character of project communications. Key events to look for include the usual response when someone enters a bug report, submits a suggested change or pull request, or proposes a discussion around the project’s development workflow. While reasonable demands can (and should) be placed on new users for following protocol, a high number of rejected changes or disgruntled first-time users tends to be an indicator of poor community relations. These are some of the reasons why we’re big proponents of the Django framework: the community is almost always warm and welcoming and is quick to enforce this culture. In addition to communications, other positive attributes to look for include documentation around adding new members to the core development team as well as codes of conduct or other policies that set forth in a public way the desire to create an inclusive community for all. These things matter because developers are people, and communication -- as in any discipline -- is critical.

Conclusion

While by no means an all-inclusive list, these are some of the factors I think it’s important to consider when selecting a new open source product to use for an ICT4D solution. I hope to have provided useful insight into the developer’s perspective, one that I think ICT4D program managers should consider when evaluating open source projects. I realize selecting projects that hold themselves to the highest standard on all of these points may be a difficult task, so, as with many things, deficiencies in one area may be made up for with excellence in others. Similarly, implementing all of the above points on an open source project you release will not result in a sudden wave of contributions from volunteer developers, but the more you can do the more you’ll lower the barrier to entry for developers and facilitate community growth.

I hope to update this post from time to time with new ideas and approaches for evaluating open source projects for use in ICT4D, so if you have any questions, comments, or suggested additions, please leave them in the comments section below. I look forward to your feedback!

Caktus GroupDurham Women in Tech (D-WiT) Starts Strong

This past Tuesday we held our very first gathering for the new Durham Women in Tech (D-WiT) Meetup group. There was a huge turnout and a lot of enthusiasm for the community we’re seeking to support and build. It was particularly wonderful to see our recently opened Astro Code School space full of people.

We began with a short mingling period. I loved hearing everyone’s stories as to why they had come. I met a wide variety of women involved or interested in tech, from students just learning to code and looking for more support in that arena, to professionals with long careers hoping to learn effective methods for shaping a more inclusive culture within the tech industry.

Hao delivered a short presentation on the evening’s topic: imposter syndrome, or the feeling that you’ve flown in under the radar and are about to be found out. The feelings of incompetency and anxiety it evokes can be triggered by doing something new, a tendency towards perfectionism, or being different from those around you. For women—and especially for women of color—being different is often a de facto situation in a male-dominated field.

More important than the discussion of what imposter syndrome is, was the discussion of how to combat it. Attendees split into four groups to offer their own personal experiences with imposter syndrome as well as the tools and methods they’ve developed for resisting it. It was such a rewarding experience to walk away with viable solutions and methods for learning to internalize one’s achievements.

Our next meeting will be in July, and I don’t think I’m alone in my excitement to meet again with this new circle of support within the local tech community.

Astro Code SchoolAstro Launches in Durham

Astro Code School Director Brian Russell tells Durham Mayor Bill Bell about the school

On Friday, May 1, we held our launch party. A lot of people showed up to welcome Astro Code School to Durham and learn about what we do. I had a great time telling our story to guests. Plus it was fun to meet Mayor Bell!

As a resident of the City of Durham I love working Downtown. It's close to where I live, convenient to a lot of great food and drink, and a great place to run into cool people all the time. I feel as if I'm part of something really awesome at a cool time in Durham history.

Astro's mission to educate people really fits well with a community that's committed to serving others. I first learned about this awesome attribute of Durhamites from friends who work at local non-profits. Inspired by them I joined AmeriCorps in 2004 as a technology VISTA at the Durham Literacy Center. This experience gave me quite an education and was a big influence on me.

A giant thanks to all the people at Caktus Consulting Group who helped organize the event. Without them it wouldn’t have been possible.

Caktus CTO Colin Copeland, Durham Mayor Bill Bell, and Caktus CBO Alex Leman

We’re right in downtown Durham at 108 Morris Street. I hope that when you have a moment you'll stop in and say hi.

Caktus GroupCakti at CRS ICT4D 2015

This is Caktus’ first year taking part in the Catholic Relief Services’ (CRS) Information and Communication Technologies for Development (ICT4D) conference. The theme of this year’s conference is increasing the impact of aid and development tools through innovation. We’re especially looking forward to all of the speakers from organizations like the International Rescue Committee, USAID, World Vision, and the American Red Cross. In fact, the offerings are so vast, we thought we would provide a little cheat sheet to help you find Cakti throughout this year’s conference.

Wednesday, May 27th

How SMS Powers Democracy in Libya
Vinod Kurup will explain how Caktus used RapidSMS, a Django-based SMS framework, to build the world’s first voter registration system in Libya.

Commodity Tracking System (CTS): Tracking Distribution of Commodities
Jack Byrne of the International Rescue Committee (IRC) is the Syria Response Director. He will present on the Caktus-built system IRC uses to track humanitarian aid for Syrian refugees.

Friday, May 29th

Before the Pilot: Planning for Scale
Caktus’ CTO Colin Copeland will be part of a panel discussion on what technology concepts matter most at the start of a project and the various challenges of pilot programs. Also on the panel will be Jake Watson of IRC and Jeff Wishnie of MercyCorps. Hao Nguyen, Caktus’ Strategy Director, will moderate.

Leveraging the Open Source Community for Truly Sustainable ICT4D
CEO Tobias McNulty will provide his insider’s perspective on the open source community and how to best use that community in the development of ICT4D tools and solutions.

Wednesday, Thursday, and Friday

Throughout the conference you can stop by the Caktus booth to read more about our ICT4D projects and services, meet Cakti, or play one of the mini-games from our Epic Allies app.

Not attending the conference? You can follow @caktusgroup and #ICT4D2015 for live updates!

Caktus GroupPyPy.js: What? How? Why? by Ryan Kelly (PyCon 2015 Must-See Talk: 5/6)

Part five of six in our PyCon 2015 Must-See Series, a weekly highlight of talks our staff enjoyed at PyCon.

From Ryan Kelly's talk I learned that it is actually possible, today, to run Python in a web browser (not something that interprets Python-like syntax and translates it into JavaScript, but an actual Python interpreter!). PyPy.js combines two technologies, PyPy (the Python interpreter written in Python) and Emscripten (an LLVM-to-JavaScript converter, typically used for getting games running in the browser), to run PyPy in the browser. This talk is a must-see for anyone who has ever longed to write client-side Python instead of JavaScript for a web app. While realistically being able to do this in production may still be a ways off, at least in part due to the multiple megabytes of JavaScript one needs to download to get it working, I enjoyed the view Ryan's talk provided into the internals of this project. PyPy itself is always fascinating, and this talk made it even more so.


More in the PyCon 2015 Must-See Talks Series.

Caktus GroupAnnouncing the New Durham Women in Tech (DWiT) Meetup

We’re pleased to officially announce the launch of a new meetup: Durham Women in Tech (DWiT). Through group discussions, lectures, panels, and social gatherings, we hope to provide a safe space for women in small and medium-sized Durham tech firms to share challenges, ideas, and solutions. We especially want to support women on the business side in roles such as operations, marketing, business development, finance, and project management.

A small group of us at Caktus decided to start DWiT after being unable to find a local group for those in similar positions to us: we work on the business side and, as part of a growing company, wear many hats. Our roles often include implementing new processes and policies, tasks that influence culture and corporate direction. We have a seat at the table, but it’s not always clear how to help our companies move forward. How do we work towards removing the barriers women face in the tech industry within our roles? How do we help ourselves and our teams when faced with gendered challenges?

By pulling together a group of similar women, we hope to pool everyone’s experiences into a shared resource. We’ve seen the power of communities for female developers through the organizations Caktus supports internationally and locally with mentors and sponsorship, including, amongst others, Girl Develop It RDU, PyLadies RDU, DjangoGirls, and Pearl Hacks. We’re looking forward to strengthening the resources for women in technology in Durham.

Our inaugural meeting is on Tuesday, May 26th at 6 pm. We will be discussing imposter syndrome, a name given to those unfortunate moments where one feels like an imposter, despite external evidence to the contrary. RSVP by joining our meetup group.

Caktus GroupKeynote by Catherine Bracy (PyCon 2015 Must-See Talk: 4/6)

Part four of six in our PyCon 2015 Must-See Series, a weekly highlight of talks our staff enjoyed at PyCon.

My recommendation would be Catherine Bracy's Keynote about Code for America. Cakti should be familiar with Code for America. Colin Copeland, Caktus CTO, is the founder of Code for Durham and many of us are members. Her talk made it clear how important this work is. She was funny, straight-talking, and inspirational. For a long time before I joined Caktus, I was a "hobbyist" programmer. I often had time to program, but wasn't sure what to build or make. Code for America is a great opportunity for people to contribute to something that will benefit all of us. I have joined Code for America and hope to contribute locally soon through Code for Durham.


More in the PyCon 2015 Must-See Talks Series.

Caktus GroupQ2 2015 ShipIt Day ReCap

Last Friday everyone at Caktus set aside their regular client projects for our quarterly ShipIt Day, a chance for Caktus employees to take some time for personal development and independent projects. People work individually or in groups to flex their creativity, tackle interesting problems, or expand their personal knowledge. This quarter’s ShipIt Day saw everything from game development to Bokeh data visualization, Lego robots to superhero animation. Read more about the various projects from our Q2 2015 ShipIt Day.


Victor worked on our version of Ultimate Tic Tac Toe, a hit at PyCon 2015. He added in Jeff Bradbury’s artificial intelligence component. Now you can play against the computer! Victor also cleaned up the code and open sourced the project, now available here: github.com/caktus/ultimatetictactoe.

Philip dove into @total_ordering, a Python class decorator that fills in the missing comparison methods a class needs in order to be sorted. Philip was curious as to why @total_ordering is necessary, and what might be the consequences of NOT using it. He discovered that though it is helpful in defining sortable classes, it is not as helpful as one would expect. In fact, rather than speeding things up, adding @total_ordering actually slows things down. But, he concluded, you should still use it to cover certain edge cases.
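For context, here is a minimal sketch of what the decorator does, using `functools.total_ordering` from the standard library. The `Version` class is a hypothetical example and is not Philip's code. Only `__eq__` and `__lt__` are written by hand; the decorator derives `<=`, `>`, and `>=` from them, and those derived methods carry some extra call overhead.

```python
from functools import total_ordering


@total_ordering
class Version:
    """A hypothetical version number that compares by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __repr__(self):
        return "Version(%d, %d)" % (self.major, self.minor)


# sorted() only needs __lt__, so it works with or without the decorator.
print(sorted([Version(2, 0), Version(1, 3), Version(1, 2)]))

# These comparisons exist only because @total_ordering filled them in.
print(Version(1, 2) <= Version(1, 3))
print(Version(2, 0) > Version(1, 9))
```

Without the decorator, sorting would still work, but an expression such as `Version(1, 2) <= Version(1, 3)` would raise a `TypeError`, which is one of the edge cases the decorator covers.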

Karen updated our project template, the foundation for nearly all Caktus projects. The features she worked on will save us all a lot of time and daily annoyance. These included pulling DB from deployed environments, refreshing the staging environment from production, and more.

Erin explored Bokeh, a Python interactive data visualization library. She initially learned about building visualizations without JavaScript during PyCon (check out the video she recommended by Sarah Bird). She used Bokeh and the Google API to display data points on a map of Africa for potential use in one of our social impact projects.

Jeff B worked on a Lisp implementation in Python. PyPy is written in a restricted version of Python (called RPython) and compiled down into highly efficient C or machine code. By implementing a toy version of Lisp on top of PyPy machinery, Jeff learned about how PyPy works.

Calvin and Colin built the beginnings of a live style guide into Caktus’ Django-project-template. The plan was loosely inspired by Mail Chimp's public style guide. They hope to eventually have a comprehensive guide of front-end elements to work with. Caktus will then be able to plug these elements in when building new client projects. This kind of design library should help things run smoothly between developers and the design team for front-end development.

Neil experimented with Mercury hoping the speed of the language would be a good addition to the Caktus toolkit. He then transitioned to building a project in Elm. He was able to develop some great looking hexagonal data visualizations. Most memorable was probably the final line of his presentation: “I was hoping to do more, but it turns out that teaching yourself a new programming language in six hours is really hard.” All Cakti developers nodded and smiled knowingly.

Caleb used Erlang and cowboy to build a small REST API. With more time, he hopes to provide a REST API that will provide geospatial searches for points of interest. This involves creating spatial indexes in Erlang’s built-in Mnesia database using geohashes.
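To illustrate why geohashes are useful for a spatial index, here is a minimal sketch of the standard geohash encoding, written in Python rather than Erlang. It is not Caleb's code, and the function name `geohash_encode` is chosen for this example. The algorithm repeatedly halves the longitude and latitude ranges, records one bit per halving, and packs every five bits into a base-32 character.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits = bits << 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits = bits << 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            # Five bits make one base-32 character.
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


point = (57.64911, 10.40744)
print(geohash_encode(*point, precision=5))
print(geohash_encode(*point, precision=9))
```

A shorter geohash is always a prefix of a longer one for the same point, so a search for points of interest inside a cell can be expressed as a prefix lookup on an ordered index.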

Mark explored some of the issues raised in the Django-project-template and developed various fixes for them, including the way secrets are managed. Now anything that needs to be encrypted is encrypted with a public key generated when you bring up the Salt master. This fixes a very practical problem in the development workflow. He also developed a Django-project-template Heroku-style deploy, setting up a proof of concept project with a “git push” to deploy workflow.

Vinod took the time to read fellow developer Mark Lavin’s book Lightweight Django while I took up DRiVE by Daniel H. Pink to read about what motivates people to do good work or even complete rote tasks.

Scott worked with Dan to compare Salt states to Ansible playbooks. In addition, Dan took a look at Ember, working with the new framework as a potential for front-end app development. He built two simple apps, one for organizing albums in a playlist, and one for to-do lists. He had a lot of fun experimenting and working with the new framework.

Edward and Lucas built a minigame for our Epic Allies app. It was a fun, multi-slot, pinball machine game built with Unity3D.

Hunter built an HTML5 game using Phaser.js. Though he didn’t have the time to make a fully fledged video game, he did develop a fun-looking board game with different characters, abilities, and animations.

NC developed several animations depicting running and jumping to be used to animate the superheroes in our Epic Allies app. She loved learning about human movement, how to create realistic animations, and outputting the files in ways that will be useful to the rest of the Epic Allies team.

Wray showed us an ongoing project of his: a front-end framework called sassless, “the smallest CSS framework available.” It consists of front-end elements that allow you to set up a page in fractions so that they stay in position when resizing a browser window (to a point) rather than the elements stacking. In other words, you can build a responsive layout with a very lightweight CSS framework.

One of the most entertaining projects of the day was the collaboration between Rebecca C and Rob, who programmed Lego-bots to dance in a synced routine using the Lego NXT software. Aside from it being a lot of fun to watch robots (and coworkers) dance, the presence of programmable Lego-bots prompted a very welcome visit from Calvin’s son Caelan, who at the age of 9 is already learning to code!

Caktus GroupInteractive Data for the Web by Sarah Bird (PyCon 2015 Must-See Talk: 3/6)

Part three of six in our PyCon 2015 Must-See Series, a weekly highlight of talks our staff enjoyed at PyCon.

Sarah Bird's talk made me excited to try the Bokeh tutorials. The Bokeh library has very approachable methods for creating data visualizations inside of Canvas elements all via Python. No JavaScript necessary. Who should see this talk? Python developers who want to add a beautiful data visualization to their websites without writing any JavaScript. Also, Django developers who would like to use QuerySets to create data visualizations should watch the entire video, and then rewind to minute 8:50 for instructions on how to use Django QuerySets with a couple of lines of code.

After the talk, I wanted to build my own data visualization map of the world with plot points for one of my current Caktus projects. I followed up with one of the friendly developers from Continuum Analytics to find out that you do not need to spin up a separate Bokeh server to get your data visualizations running via Bokeh.

Astro Code SchoolFall Registration Now Open

Registration for the fall Python & Django Web Engineering class is open. You can fill out the application form on the Apply page and get more details on the application Process page. The deadline for applying is August 24, 2015. You can find a full syllabus for this class over on its page be102.

This class is twelve weeks long and full time Monday to Friday from 9 AM – 5 PM. It'll be taught here at the Astro Code School at 108 Morris Street, Suite 1b, Durham, NC.

Python and Django make a powerful team to build maintainable web applications quickly. When you take this course you will build your own web application during lab time with assistance from your teacher and professional Django developers. You’ll also receive help preparing your portfolio and resume to find a job using the skills you’ve learned.

Please contact me if you have any questions.

Caktus GroupCakti Comment on Django's Class-based Views

After PyCon 2015, we were surprised when we realized how many Cakti who attended had all been asked about Django's class-based views (CBVs). We talked about why this might be, and this is a summary of what we came up with.

Lead Front-End Developer Calvin Spealman has noticed that there are many more tutorials on how to use CBVs than on how to decide whether to use them.

Astro Code School Lead Instructor Caleb Smith reminded us that while "less code" is sometimes given as an advantage of using CBVs, it really depends on what you're doing. Each case is different.

I pointed out that there seem to be some common misconceptions about CBVs.

Misconception: Functional views are deprecated and we're all supposed to be writing class-based views now.

Fact: Functional views are fully supported and not going anywhere. In many cases, they're a good choice.

Misconception: CBVs means using the generic class-based views that Django provides.

Fact: You can use as much or as little of Django's generic views as you like, and still be using class-based views. I like Vanilla Views as a simpler, easier to understand alternative to Django's generic views that still gives all the advantages of class-based views.

So, when to use class-based views? We decided the most common reason is if you want to reuse code across views. This is common, for example, when building APIs.

Caktus Technical Director Mark Lavin has a simple answer: "I default to writing functions and refactor to classes when needed writing Python. That doesn't change just because it's a Django view."

On the other hand, Developer Rebecca Muraya and I tend to just start with CBVs, since if the view will ever need to be refactored that will be a lot easier if it was split up into smaller bits from the beginning. And so many views fall into the standard patterns of Browse, Read, Edit, Add, and Delete that you can often implement them very quickly by taking advantage of a library of common CBVs. But I'll fall back to Mark's system of starting with a functional view when I'm building something that has pretty unique behavior.
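To make the "reuse code across views" point concrete, here is a minimal, framework-free sketch. It deliberately does not use Django's API: the request is a plain dict, and every name (`book_list_view`, `BaseListView`, `allowed_methods`, and so on) is hypothetical and chosen only for illustration. It shows the same view written once as a function and once as a class whose shared behavior can be inherited.

```python
# Framework-free sketch. All names are hypothetical; this is not Django's API.

def book_list_view(request):
    """Function-style view: everything happens in one callable."""
    if request["method"] != "GET":
        return {"status": 405, "body": "Method not allowed"}
    return {"status": 200, "body": "books: " + ", ".join(["Dune", "Emma"])}


class BaseListView:
    """Class-style view: shared request handling lives in one place."""

    allowed_methods = ("get", "post")
    items = []       # subclasses override this
    label = "items"  # ...and this

    def __call__(self, request):
        # Dispatch on the HTTP method name, falling back to a 405 response.
        name = request["method"].lower()
        handler = getattr(self, name, None) if name in self.allowed_methods else None
        if handler is None:
            return {"status": 405, "body": "Method not allowed"}
        return handler(request)

    def get(self, request):
        return {"status": 200, "body": f"{self.label}: " + ", ".join(self.items)}


class BookListView(BaseListView):
    items = ["Dune", "Emma"]
    label = "books"


class AuthorListView(BaseListView):
    # A second view costs only two lines because the logic is inherited.
    items = ["Herbert", "Austen"]
    label = "authors"
```

For a single view the function is shorter and arguably clearer, which matches Mark's default. The class version starts to pay off once a second or third view needs the same dispatch and rendering logic.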

Tim HopperHow I Became a Data Scientist Despite Having Been a Math Major

Caution: the following post is laden with qualitative extrapolation of anecdotes and impressions. Perhaps ironically (though perhaps not), it is not a data driven approach to measuring the efficacy of math majors as data scientists. If you have a differing opinion, I would greatly appreciate it if you would carefully articulate it and share it with the world.

I recently started my third "real" job since finishing school; at my first and third jobs I have been a "data scientist". I was a math major in college (and pretty good at it) and spent a year in the math Ph.D. program at the University of Virginia (and performed well there as well). These two facts alone would not have equipped me for a career in data science. In fact, it remains unclear to me that those two facts alone would have prepared me for any career (with the possible exception of teaching) without significantly more training.

When I was in college Business Week published an article declaring "There has never been a better time to be a mathematician." At the time, I saw an enormous disconnect between the piece and what I was being taught in math classes (and thus what I considered to be a "mathematician"). I have come across other pieces lauding this as the age of the mathematicians, and more often than not, I've wondered if the author knew what students actually studied in math departments.

The math courses I had as an undergraduate were:

  • Linear algebra
  • Discrete math
  • Differential equations (ODEs and numerical)
  • Theory of statistics 1
  • Numerical analysis 1 (numerical linear algebra) and 2 (quadrature)
  • Abstract algebra
  • Number theory
  • Real analysis
  • Complex analysis
  • Intermediate analysis (point set topology)

My program also required a one semester intro to C++ and two semesters of freshman physics. In my year as a math Ph.D. student, I took analysis, algebra, and topology classes; had I stayed in the program, my future coursework would have been similar: pure math where homework problems consist almost exclusively of proofs done with pen and paper (or in LaTeX).

Though my current position occasionally requires mathematical proof, I suspect that is rare among data scientists. While the "data science" demarcation problem is challenging (and I will not seek to solve it here), it seems evident that my curriculum lacked preparation in many essential areas of data science. Chief among these are programming skill, knowledge of experimental statistics, and experience with math modeling.

Few would argue that programming ability is not a key skill of data science. As Drew Conway has argued, a data scientist need not have a degree in computer science, but "Being able to manipulate text files at the command-line, understanding vectorized operations, thinking algorithmically; these are the hacking skills that make for a successful data hacker." Many of my undergrad peers, having briefly seen C++ freshman year and occasionally used Mathematica to solve ODEs for homework assignments, would have been unaware that manipulation of a file from the command-line was even possible, much less have been able to write a simple sed script; there was little difference with my grad school classmates.

Many data science positions require even more than the ability to solve problems with code. As Trey Causey has recently explained, many positions require understanding of software engineering skills and tools such as writing reusable code, using version control, software testing, and logging. Though I gained a fair bit of programming skill in college, these skills, now essential in my daily work, remained foreign to me until years later.

My math training had a lack of statistics courses. Though my brief exposure to mathematical statistics has been valuable in picking up machine learning, experimental statistics was missing altogether. Many data science teams are interested in questions of causal inference and design and analysis of experiments; some would make these essential skills for a data scientist. I learned nothing about these topics in math departments. Moreover, machine learning, also a cornerstone of data science, is not a subject I could have even defined until after I was finished with my math coursework; at the end of college, I would have said artificial intelligence was mostly about rule-based systems in Lisp and Prolog.

Yet even if statistics had played a more prominent role in my coursework, those who have studied statistics know there is often a gulf between understanding textbook statistics and being able to effectively apply statistical models and methods to real world problems. This is only an aspect of a bigger issue: mathematical (including statistical) modeling is an extraordinarily challenging problem, but instruction on effectively modeling real world problems is absent from many math programs. To this day, defining my problem in mathematical terms is one of the hardest problems I face; I am certain that I am not alone in this. Though I am now armed with a wide variety of mathematical models, it is rarely clear exactly which model can or should be applied in a given situation.

I suspect that many people, even technical people, are uncertain as to what academic math is beyond undergraduate calculus. Mathematicians mostly work in the logical manipulation of abstractly defined structures. These structures rarely bear any necessary relationship to physical entities or data sets outside the abstractly defined domain of discourse. Though some might argue I am speaking only of "pure" mathematics, this is often true of what is formally known as "applied mathematics". John D. Cook has made similar observations about the limitations of pure and applied math (as proper disciplines) in dubbing himself a "very applied mathematician". Very applied mathematics is "an interest in the grubby work required to see the math actually used and a willingness to carry it out. This involves not just math but also computing, consulting, managing, marketing, etc." These skills are conspicuously absent from most math curricula I am familiar with.

Given this description of how my schooling left me woefully unprepared for a career in data science, one might ask how I have had two jobs with that title. I can think of several (though probably not all) reasons.

First, the academic study of mathematics provides much of the theoretical underpinnings of data science. Mathematics underlies the study of machine learning, statistics, optimization, data structures, analysis of algorithms, computer architecture, and other important aspects of data science. Knowledge of mathematics (potentially) allows the learner to more quickly grasp each of these fields. For example, learning how principal component analysis (a math model that can be applied and interpreted by someone without formal mathematical training) works will be significantly easier for someone with earlier exposure to linear algebra. On a meta-level, training in mathematics forces students to think carefully and solve hard problems; these skills are valuable in many fields, including data science.
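As an illustration of that linear algebra connection, here is a minimal sketch of principal component analysis computed directly from a singular value decomposition in NumPy. The data set is synthetic and every variable name is an assumption made for this example; it is one common way to compute PCA, not the only one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for this sketch): 200 points with large
# variance along the direction (1, 1) and small variance along (-1, 1).
base = rng.normal(size=(200, 2)) * np.array([3.0, 0.3])
rotation = np.array([[1.0, -1.0], [1.0, 1.0]]) / np.sqrt(2.0)
data = base @ rotation.T

# Center the data, then take the SVD of the centered matrix.
centered = data - data.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Rows of vt are the principal directions, ordered by singular value.
first_direction = vt[0]

# Squared singular values, scaled, give the variance along each direction.
explained_variance = s ** 2 / (len(data) - 1)

# Coordinates of each point in the principal-component basis.
scores = centered @ vt.T
```

Someone who already knows what an orthogonal basis and a singular value are can read this in a minute: the principal directions are the right singular vectors of the centered data, and the scores are uncorrelated because those vectors are orthonormal.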

My second reason is connected to the first: I unwittingly took a number of courses that later played important roles in my data science toolkit. For example, my current work in Bayesian inference has been made possible by my knowledge of linear algebra, numerical analysis, stochastic processes, measure theory, and mathematical statistics.

Third, I did a minor in computer science as an undergraduate. That provided a solid foundation for me when I decided to get serious about building programming skill in 2010. Though my academic exposure to computer science lacked any software engineering skills, I left college with a solid grasp of basic data structures, analysis of algorithms, complexity theory, and a handful of programming languages.

Fourth, I did a master's degree in operations research (after my year as a math PhD student convinced me pure math wasn't for me). This provided me with experience in math modeling, a broad knowledge of mathematical optimization (central to machine learning), and the opportunity to take graduate-level machine learning classes.1

Fifth, my insatiable curiosity in computers and problem solving has played a key role in my career success. Eager to learn something about computer programming, I taught myself PHP and SQL as a high school student (to make Tolkien fan sites, incidentally). Having been given small Mathematica-based homework assignments in freshman differential equations, I bought and read a book on programming Mathematica. Throughout college and grad school, I often tried—and sometimes succeeded—to write programs to solve homework problems that professors expected to be solved by hand. This curiosity has proven valuable time and time again as I've been required to learn new skills and solve technical problems of all varieties. I'm comfortable jumping in to solve a new problem at work, because I've been doing that on my own time for fifteen years.

Sixth, I have been fortunate enough to have employers who have patiently taught me and given me the freedom to learn on my own. I have learned an enormous amount in my two and a half year professional career, and I don't anticipate slowing down any time soon. As Mat Kelcey has said: always be sure you're not the smartest one in the room. I am very thankful for three jobs where I've been surrounded by smart people who have taught me a lot, and for supervisors who trust me enough to let me learn on my own.

Finally,4 it would be hard for me to overvalue the four and a half years of participation in the data science community on Twitter. Through Twitter, I have the ear of some of data science's brightest minds (most of whom I've never met in person), and I've built a peer network that has helped me find my current and last job. However, I mostly want to emphasize the pedagogical value of Twitter. Every day, I'm updated on the release of new software tools for data science, the best new blog posts for our field, and the musings of some of my data science heroes. Of course, I don't read every blog post or learn every software tool. But Twitter helps me to recognize which posts are most worth my time, and because of Twitter, I know something instead of nothing about Theano, Scalding, and dplyr.2

I don't know to what extent my experience generalizes3, in either the limitations of my education or my analysis of my success, but I am obviously not going to let that stop me from drawing some general conclusions.

For those hiring data scientists, recognize that mathematics as taught might not be the same mathematics you need from your team. Plenty of people with PhDs in mathematics would be unable to define linear regression or Bloom filters. At the same time, recognize that math majors are taught to think well and solve hard problems; these skills shouldn't be undervalued. Math majors are also experienced in reading and learning math! They may be able to read academic papers and understand difficult (even if new) mathematics more quickly than a computer scientist or social scientist. Given enough practice and training, they would probably be excellent programmers.

For those studying math, recognize that the field you love, in its formal sense, may be keeping you away from enjoyable and lucrative careers. Most of your math professors have spent their adult lives solving math problems on paper or on a chalkboard. They are inexperienced and, possibly, unknowledgeable about very applied mathematics. A successful career in pure mathematics will be very hard and will require you to be very good. While there seem to be lots of jobs in teaching, they will rarely pay well. If you're still a student, you have a great opportunity to take control of your career path. Consider taking computer science classes (e.g. data structures, algorithms, software engineering, machine learning) and statistics classes (e.g. experimental design, data analysis, data mining). For both students and graduates, recognize your math knowledge becomes very marketable when combined with skills such as programming and machine learning; there are a wealth of good books, MOOCs, and blog posts that can help you learn these things. Moreover, the barrier to entry for getting started with production quality tools has never been lower. Don't let your coursework be the extent of your education. There is so much more to learn!5


  1. At the same time, my academic training in operations research failed me, in some aspects, for a successful career in operations research. For example, practical math modeling was not sufficiently emphasized and the skills of computer programming and software development were undervalued. 

  2. I have successfully answered more than one interview question by regurgitating knowledge gleaned from tweets. 

  3. Among other reasons, I didn't really plan to get where I am today. I changed majors no fewer than three times in college (physics, CS, and math) and essentially dropped out of two PhD programs! 

  4. Of course, I have plenty of data science skills left to learn. My knowledge of experimental design is still pretty fuzzy. I still struggle with effective mathematical modeling. I haven't deployed a large scale machine learning system to production. I suck at software logging. I have no idea how deep learning works. 

  5. For example, install Anaconda and start playing with some of these IPython notebooks

Tim HopperPublishing a Static Site Generator from iOS

A few weeks ago, I set up Travis CI so this Pelican-based blog will publish itself when I commit a new post to GitHub.

At the time, I asked on Twitter if there were any good Git clients that would allow me to push new posts from my iPad; I didn't get any promising replies.

However, I just found out about an app called Working Copy, "a powerful Git client for iOS 8 that clones, edits, commits, pushes, and more."

I just cloned my Stigler Diet repo on my iPad, and I'm composing this post from the Whole Foods cafe on my iPad. If you're reading this post, it's because I successfully published it from here!

Astro Code SchoolVideo - Tips for Using Generators in Python

Here's the third screencast video in Caleb Smith's series about functional programming in Python. This one describes generators, iterators and iterables in Python with some tips on how to implement generators.
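For readers who want the vocabulary before watching, here is a small sketch (not taken from the screencast; the names `countdown` and `Countdown` are made up for illustration) showing a generator function, an iterable class built on it, and a generator expression.

```python
from itertools import islice


def countdown(start):
    """Generator function: lazily yields start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


class Countdown:
    """Iterable: every call to iter() returns a fresh generator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return countdown(self.start)


gen = countdown(3)
first = next(gen)  # pulls a single value from the generator
rest = list(gen)   # consumes whatever is left
again = list(gen)  # the generator is exhausted, so this is empty

reusable = Countdown(3)                # an iterable can be looped over repeatedly
squares = (n * n for n in reusable)    # generator expression, evaluated lazily
first_two = list(islice(squares, 2))   # take only the first two values
```

The distinction worth noticing is that `gen` is an iterator and can be consumed only once, while `reusable` is an iterable that produces a new iterator each time it is looped over.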

Don't forget to subscribe to the Astro Code School YouTube channel. Lots more educational screencasts to come.

Caktus GroupBeyond PEP 8 by Raymond Hettinger (PyCon 2015 Must-See Talk: 2/6)

Part two of six in our PyCon 2015 Must-See Series, a weekly highlight of talks our staff enjoyed at PyCon.

I think everyone who codes in any language and uses any automated PEP-8 or linter sort of code checker should watch this talk. Unfortunately to go into any detail on what I learned (or really was reminded of) would ruin the effect of actually watching the talk. I'd encourage everyone to watch it. I came away from the talk wanting to figure out a way to incorporate its lesson into our Caktus development practices.

Frank WierzbickiJython 2.7.0 final released!

On behalf of the Jython development team, I'm pleased to announce that the final release of Jython 2.7.0 is available! It's been a long road to get to 2.7, and it's finally here! I'd like to thank Amobee for sponsoring my work on Jython. I'd also like to thank the many contributors to Jython, including - but not limited to - bug reports, patches, pull requests, documentation changes, support emails, and fantastic conversation on Freenode at #jython.

Along with language and runtime compatibility with CPython 2.7.0, Jython 2.7 provides substantial support of the Python ecosystem. This includes built-in support of pip/setuptools (you can use with bin/pip) and a native launcher for Windows (bin/jython.exe), with the implication that you can finally install Jython scripts on Windows.

Jim Baker presented a talk at PyCon 2015 about Jython 2.7, including demos of new features.

Please see the NEWS file for detailed release notes. This release of Jython requires JDK 7 or above.

This release is being hosted at maven central. There are three main distributions. In order of popularity:
To see all of the files available including checksums, go here and navigate to the appropriate distribution and version.

Astro Code SchoolVideo - Implementing Decorators in Python

This screencast provides some insights into implementing decorators in Python using functional programming concepts and demonstrates some instances where decorators can be useful.

In the video, I reference the blog post Python Decorators in 12 Steps by Simeon Franklin for further reading.
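As a quick companion to the video, here is a minimal sketch of a decorator. It is not code from the screencast; `count_calls` and `add` are hypothetical names. It shows the usual pattern of a function that takes a function and returns a wrapped version of it.

```python
import functools


def count_calls(func):
    """Return a wrapper around func that counts how often it is called."""

    @functools.wraps(func)  # keep func's name and docstring on the wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def add(a, b):
    """Return the sum of a and b."""
    return a + b


result = add(2, 3)
add(4, 5)
```

After these two calls, `add.calls` holds the number of invocations, and thanks to `functools.wraps` the decorated function still reports its original name and docstring.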

Caktus GroupCaktus Wins Two Communicator Awards for PyCon 2015

We’re thrilled to announce that we’ve won two Communicator Awards in this year’s 2015 Communicator Awards competition. With over 6000 entries received from across the US and around the world, the Communicator Awards is considered the largest and most competitive international awards program honoring creative excellence for communications professionals.

Caktus Group was honored with the Gold Award for Excellence in Event Website and Silver Award for Distinction for Visual Appeal for the PyCon 2015 site. Both awards recognize the work of designer Trevor Ray, developers David Ray and Rebecca Muraya, and project manager Ben Riseling.

Of course, we’re excited for our work to be recognized, but these awards also represent an opportunity for PyCon to receive well-deserved recognition, especially for the hard work of the event’s organizers. With the 2015 Communicator Awards, they have been placed in the company of such large brands as the Canadian Olympic Team, Frito-Lay, Lexus, and Red Hat.

You can learn more about the origins of the site’s design and the design process for Trevor’s graphic design by listening to his lightning talk “Reimagining PyCon 2015”.

Caktus GroupAIGA Durham Studio Tour Recap

This was the first year Caktus Group participated in the AIGA studio tour and the turnout was amazing. From 5:30 PM till the 9:00 PM close, we had visitors ranging from students to tenured professionals in the design and web development fields sharing stories and touring the newly renovated Caktus Group office. Members from the Caktus design, development, and management teams were present to field questions, give tours, and show select works from the past year.

From the Epic Allies team, visitors got to see a preview of the app’s mini games and designs. Epic Allies is an app that seeks to gamify the process of taking HIV medication. The goal is to help HIV-positive individuals develop and maintain positive habits around taking their medication and making other healthy life choices. The Epic Allies project has been in progress since 2012 and it’s been great to see it evolve.

Visitors were also able to view and explore the 2015 PyCon website. The design and development of the website were completed by Caktus Group in early 2015. Elements of the design were then used throughout the PyCon conference venue in Montreal. The bright winding forms of the design worked well on screen, but they really enveloped the venue and tied everything together. It was a fantastic project made possible by the hard work of many Caktus staff and the conference organizers Ewa Jodlowska and Diana Clarke, who were great to work with.

Finally, there was a behind-the-scenes video of the Caktus Group reception sign installation and the original install template. The video was shot and edited by Caktus’ Wray Bowling and showed the start to finish process of installing the reception sign that was beautifully crafted by Jim at ArtCraft Sign Company - Thanks, Jim. Having missed the actual installation of the sign, I’m glad Wray captured the process.

By the time 9 PM rolled around, a lot of work was viewed, beers were drunk, and information was shared with new friends. If you didn’t make it out for this year’s AIGA studio tour, don’t be sad. You can still make it out next year. There are a lot of talented people in the Triangle and with so many open studio doors you’re bound to run into more than a few of them.

Caktus GroupMarketplace Radio Highlights How Service Info App Helps 1.5 Million Syrian Refugees

Image Courtesy of UK Department for International Development [CC BY 2.0], via Wikimedia Commons

Recently, one of our projects, Service Info, received national attention thanks to a Marketplace interview. American Public Media’s Kai Ryssdal spoke with International Rescue Committee CEO David Miliband about how Service Info is helping 1.5 million refugees of the Syrian conflict in Lebanon. The Syrian conflict is one of the worst ongoing humanitarian crises, accounting for the majority of the world’s refugees.

“We don’t just need to do more in the Syria crisis, but we’ve got to do things differently,” said Miliband. “The refugees from Syria are educated people, they’re tech savvy people.”

Enter Service Info, a platform developed by Caktus in conjunction with the IRC and the United States government to provide a mobile means for refugees to report on, rate, and find the services available to them. Thus far, displaced persons have been one among millions, adrift without the means to inform themselves or take action in their own self-care. The Service Info platform acts as a reliable source of information, informing individuals as to where they can cash in various vouchers for goods and aid services for instance, or where their children can attend school. More significantly, the platform enables users to comment on these services. Such feedback will in turn improve the quality of service.

“Until now, there’s been no proper tech platform for [refugees] to find out what services are available to them,” said Miliband.

Service Info is revolutionary in providing just such a platform. Once the system has been in use on the ground for a certain length of time, Caktus and the IRC hope to increase the reach of Service Info by open sourcing the app. Making the source code freely available enables others to use, improve upon, and replicate the platform. Agencies working in conflict zones and natural disasters would be able to use it to support displaced persons.

Listen to the complete interview to learn more about the excellent work being done by the International Rescue Committee in supporting the world’s most challenging crises.

Caktus GroupPyCon 2015 Talks: Our Must See Picks (1/6)

Whether you couldn’t make it to PyCon this year, were busy attending one of the other amazing talks, or were simply too enthralled by the always popular “hallway track”, there are bound to be talks you missed out on. Thankfully, the PyCon staff does an amazing job not only organizing the conference for the attendees and the days of the conference, but also by producing recordings of all the talks for anyone who couldn’t attend. Even if you attended, you couldn’t have seen every talk, so these recordings are a great safety net.

Because there are so many of them, I asked those who attended for suggestions. We will share our six favorites, one a week, for the next few weeks. Take some time to watch and learn from these talented speakers, recommended by Caktus staff who can’t stop talking about the great time they had in Montreal.

Keynote by Jacob Kaplan-Moss

Suggested by Technical Director Mark Lavin

"Jacob's keynote on Sunday was amazing. He really breaks down the myth of the 10x programmer and why it hurts the tech community. Everyone should watch it. I came away from this talk thinking about how we could improve our hiring and review process to ensure we aren't falling in the traps set by this myth. He's an amazing speaker and leader for our community."

Caktus GroupWhy did Caktus Group start Astro Code School?

Our Astro Code School is now officially accepting applications to its twelve-week Python & Django Web Development class for intermediate programmers! To kick off Astro’s opening, we asked Caktus’ CTO and co-founder Colin Copeland, who recently won a 2015 Triangle Business Journal 40 Under 40 Leadership Award, and Astro’s Director Brian Russell to reflect on the development of Astro as well as the role they see the school playing in the Django community.


Why open the East Coast’s first Django and Python-focused code school?

Colin: Technology is an important part of economic growth in the Triangle area and we wanted to make sure those opportunities reached as many residents as possible. We saw that there were no East Coast formal adult training programs for Django or Python, our specialities. We have experience in this area, having hosted successful Django boot camps and private corporate trainings. Opening a code school was a way to consolidate the training side of Caktus’ business while also giving back to the Triangle-area community by creating a training center to help those looking to learn new skills.

Brian: Ultimately, Caktus noticed a need for developers and the lack of a central place to train them. The web framework Django is written in Python and Python is a great language for beginning coders. Python is the top learning language for the nation’s best universities. Those are the skills prominent here at Caktus. It was an opportunity to train more people and prepare them for the growing technology industry at firms like Caktus.

How has demand for Django-based web applications changed since Caktus first began?

Colin: It has increased significantly. We only do Django development now, we weren’t specialized in that way when we first started. The sheer number of inbound sales requests is much higher than before. More people are aware of Django, conferences are bigger. Most significantly, it has an ever-growing reputation as a more professional, stable, and maintainable framework than other languages.

How does Astro, then, fit into this growth timeline?

Colin: It’s a pretty simple supply and demand ratio. Astro comes out of a desire to add more developers to the field and meet a growing demand for Django coders. The Bureau of Labor Statistics projects a 20% growth in demand for web developers by 2020. It is not practical to wait for today’s college, high school, or even middle-school students to become developers. Many great software developers are adults coming from second or third careers. Our staff certainly reflects this truth. Astro provides one means for talented adults to move into the growing technology industry.

Where do you see Astro fitting in to the local Python and Django community? For instance, how do you envision Astro’s relationship to some of the groups Caktus maintains a strong relationship with, such as Girl Develop It or TriPython?

Colin: Astro’s goals clearly align with those of Girl Develop It in terms of training and support. And the space will be a great place to host events for local groups and classes.

Brian: Yeah, I see it as a very natural fit. We hope to help those organizations by sponsoring meetups, hosting events, and providing free community programs and workshops. And there is the obvious hope that folks from those groups will enroll as students at Astro. I think it’s also important to note that Chris Calloway, one of the lead organizers for TriPython, is a member of the Astro advisory committee. There is a natural friendship with that community.

How do you hope Astro will change and add to Durham’s existing technical community?

Brian: In general there are a lot of students with training from Astro who will be able to bring their skills to local businesses, schools, non-profits, all sorts of organizations. For me, computer programming is like reading, writing, and arithmetic: it should be a part of core curriculum for students these days. It helps people improve their own net worth and contribute to the local economy. Astro is all about workforce development and improving technical literacy: two things that help entrepreneurs and entrepreneurial enterprises.

What are some of the main goals for Astro in its first year?

Brian: I want to help people find better, higher paying jobs by obtaining skills that are usable in the current economy through our 12-week classes. I’m personally interested in social economic justice and one way to achieve that is by being highly skilled. Training helps people better themselves no matter what kind of education it is. In the 21st century, computer programming education is one of the most powerful tools for job preparedness and improvement.

Colin: I would love to follow alumni who make it through the classes and see how their skills help them in their careers.

A huge amount of work has gone into getting Astro licensed with the North Carolina Community College Board. A lot of code schools are not licensed. Why was this an important step for Astro?

Brian: Mainly because we wanted to demonstrate to potential students and the public at large that we’ve done our due diligence, that other groups and professionals have vetted us and qualified us as prepared to serve. Ultimately we are licensed in order to protect consumers. Not just licensed—we’re bonded, licensed, and insured. And this is an ongoing guarantee to our students. We will be audited annually for six years. I see it as a promise for continuous and ongoing protection, betterment, and improvement.

So, who would you describe as the ideal student for an Astro course?

Brian: A lot of students. Any. All different kinds. But, more specifically? I would recommend it to folks changing their career. Or people who graduated from high school, but for one reason or another are not able to go onto higher education. Astro classes will be excellent for job preparedness and training for anyone looking to market themselves in the current economy.

Additionally, anyone fine tuning their career after college or even after grad school. Coding and learning to code is an excellent way to earn money to pay for school without getting into debt. Astro is in no way a replacement for higher ed, but coding classes can augment a well-rounded education. Successful people have a diverse education. And learning to code enables people to align their toolkits for the modern job market.


To learn more about Astro, meet Colin and Brian in person, and celebrate the opening of Astro Code School, be sure to stop by the school’s Launch Party on Friday, May 1st from 6:00 to 9:00 pm. Registration is required.

Astro Code SchoolVideo - Functional Programming in Python

In this video our Lead Instructor Caleb Smith presents basic functional programming concepts and how to apply them in Python. Check back later for more screencasts here and on the new Astro YouTube channel.

Astro Code SchoolIntro to Django by PyLadies RDU

PyLadies RDU will be offering a free four-hour workshop on Django, taught by Caktus Django developer Rebecca Conley. It will be held here at Astro Code School on Saturday, May 30, 2015 from 4pm to 8pm. For more information and to RSVP, please join the PyLadies RDU meetup group.

Caktus GroupQ1 2015 Charitable Giving

Though our projects often have us addressing issues around the globe, we like to turn our focus to the local level once a quarter with our charitable giving program. Each quarter we ask our employees to suggest charities and organizations that they are involved in or that have had a substantive influence on their lives. It’s our way of supporting not only our own employees, but the wider community in which we live and work. This quarter we are pleased to be sending contributions to the following organizations:

The Scrap Exchange

http://scrapexchange.org
The Scrap Exchange is a nonprofit creative reuse center in Durham, North Carolina whose mission is to promote creativity and environmental awareness. The Scrap Exchange provides a sustainable supply of high-quality, low-cost materials for artists, educators, parents, and other creative people. This is the second time staff nominated this organization.

Durham County Library

http://durhamcountylibrary.org/
The Durham County Library provides extensive library services, including book, DVD, audiobook, and A/V equipment rentals. They also provide computer services, internet access, meeting and study rooms on site, as well as a bookmobile and Older Adult and Shut-In Services for those unable to visit the library. Aside from the library’s service towards the community, their archives were incredibly helpful in the restoration of the building at 108 Morris St where our office is now located. Caktus is particularly thankful for the work of Lynn Richardson, Local History Librarian of the North Carolina Collection, for her invaluable help in the restoration process.

Preservation Durham

http://preservationdurham.org/
Preservation Durham’s mission is to protect Durham’s historic assets through action, advocacy, and education. They provide home tours, walking tours, and virtual tours of Durham. They also advocate for historic places in peril and provide informative workshops for those interested in preserving and restoring historical sites. Their workshops were vital in the restoration of our historic office building in downtown Durham.

Durham Bike Co-Op

http://www.durhambikecoop.org/
The Durham Bike Co-op is an all-volunteer, nonprofit, community bike project whose programming includes hands-on repair skill share, the earn-a-bike program, various mobile bike clinics, and community ride events. They help people build, repair, maintain and learn about bicycles and bicycle commuting. Their community-oriented vision and shared labor practices are definitively Durham.

Diaper Bank of North Carolina

http://ncdiaperbank.org/
Safety net programs such as food stamps and WIC do not cover diapers. And a healthy supply of diapers can fall out of the financial reach of many using these programs. The Diaper Bank of North Carolina provides diapers to families in need. The organization makes it easy to get involved—in fact, Caktus leadership volunteered not too long ago—and it addresses a critical need in the fight against poverty in the Triangle.

Frank WierzbickiJython 2.7 release candidate 3 available!

On behalf of the Jython development team, I'm pleased to announce that the third release candidate of Jython 2.7 is available! I'd like to thank Amobee for sponsoring my work on Jython. I'd also like to thank the many contributors to Jython.

Please see the NEWS file for detailed release notes. This release of Jython requires JDK 7 or above.

This release is being hosted at maven central. There are three main distributions. In order of popularity:
To see all of the files available including checksums, go here and navigate to the appropriate distribution and version.

Caktus GroupCaktus Group's Colin Copeland Recognized Among TBJ’s 40 Under 40

Caktus co-founder and Chief Technology Officer, Colin Copeland, is among an outstanding group of top business leaders to receive the Triangle Business Journal’s 2015 40 Under 40 Leadership Award. The award recognizes individuals for their remarkable contributions to their organizations and to the community.

Colin was one of the co-founders of Caktus, started in 2007 around a second-hand Chapel Hill dining room table. Now, Caktus is the nation’s largest custom web and mobile software firm specializing in Django, an open source web framework. Caktus has built over 100 solutions that have reached more than 4 million lives. Clutch.io, a research firm, lists Caktus as one of the nation’s top web development firms. As a direct result of Colin’s guidance and vision, Caktus has built technology that not only helps business clients, but has addressed some of the most difficult global challenges facing us today: humanitarian aid for war refugees, HIV/AIDS, and open access to democracy, among others.

Colin also served as UNICEF’s community coordinator for RapidSMS, a platform to build technology for developing nations quickly and freely. He used his experience as part of the Django open source community to lay the foundations of a global network of developers working towards improving the world. RapidSMS projects, featured by the BBC, Time Magazine, Fast Company, and others, have reached untold millions in the effort to improve daily lives.

Colin, a Durham resident, is passionate about improving his local community. He used his community-building skills and keen technical expertise to found Code for Durham, a volunteer group dedicated to improving civic engagement by building free technology tools. The group includes software developers, designers, civic activists, policy experts, and government employees. Colin, along with key Code for Durham members, successfully lobbied for increased Durham government transparency via a new Open Data Manager position. The group is working on web applications to help with school navigation, homelessness, bike crash locations, and more.

In keeping with the spirit of supporting his local Durham community, Colin led the historic restoration of Caktus’ new headquarters in downtown Durham. He ensured renovations included a community meeting space that could support local technology groups such as TriPython, Girl Develop It RDU, and PyLadies RDU. He is also a member of Durham’s Rotary Club.

A strong advocate for the power of technology to change lives, Colin led the founding of Caktus’ Astro Code School. Astro provides full-time software development education for adults in an inclusive environment, and will increase access to the Triangle’s growing technology industry.

Colin will be honored at the 40 Under 40 Leadership Awards Gala on June 11th at the Cary Prestonwood Country Club. The Triangle Business Journal will also profile him in a special section of their June 12th print edition.

Caktus GroupPyCon 2015 ReCap

The best part of PyCon? Definitely the people. This is my fifth PyCon, so I’ve had a chance to see the event evolve, especially with the fantastic leadership of Ewa Jodlowska and Diana Clarke. We were also lucky enough to work with them on the PyCon 2015 website. This year we were once again located in the Centre-Ville section of Montreal, close to lots of great restaurants and entertainment.

Mark Lavin, David Ray, and Caleb Smith arrived before the official start of the conference to host a workshop on “Building SMS Applications with Django.” As avid users of RapidSMS for many of our projects, including UNICEF’s Project Mwana and the world’s first SMS voter registration app for Libya, we found it a great experience to share our knowledge.

We also had a chance to work with future Django developers through the DjangoGirls Workshop this year. Karen Tracey, David Ray, and Mark Lavin served as mentors to help the mentees build their first Django app. It was wonderful to watch new programmers develop their first apps and we are looking forward to participating in similar events in the future.

The conference kicked off Thursday night with a reception where we debuted a game we built during one of our ShipIt Days. Our Caktus-designed “Ultimate Tic Tac Toe” was a huge hit!

Also on Thursday, the O'Reilly booth held a book signing for Mark Lavin’s Lightweight Django that he coauthored with Julia Elman. An impressively long line of people showed up for the event. Luckily, Mark’s around the office enough that we can get him to sign all sorts of books for us.

Look at all those people!

Friday and Saturday the trade booth show was in full swing. At the Caktus booth, people continued to line up to play “Ultimate Tic Tac Toe” and we gave away five copies of Mark’s book, Lightweight Django, as well as three quadcopters. We were sad to see the quadcopters leave the office but hope that the new recipients enjoy playing with them as much as we did.

We also had some visits from our PyCon 2015 ticket giveaway winners. We gave tickets to the Python community at large and to our local community groups here in North Carolina, including TriPython, Girl Develop It RDU, and PyLadies RDU.

Duckling, an app we developed to make it easier to find and join casual outings at conferences, was also in full use this year at PyCon. We brought along the app’s mascot Quacktus. He even had his own Twitter handle this year to give a bird’s eye view of PyCon happenings. It was great to once again use the app to meet new people and catch up with old friends while exploring Montreal.

On the last night of PyCon, PyLadies held their charity auction and Caktus donated a framed collage of Trevor Ray’s preliminary artwork and sketches that went into his redesign of the PyCon 2015 website. We were very honored that it sold for $1,000 (the second-highest bid of the auction, behind only Disney’s artwork) and are glad we can provide support to all of the awesome work PyLadies does for the community.

PyCon was, as always, a terrific time for us and we can’t wait until 2016. See you in Portland!

Footnotes