A planet of blogs from our members...

Caktus Group: Query Expressions are Amazing

The Django 1.8 release added support for complex query expressions. The documentation has some nice examples but they don't do justice to how crazy awesome these are. In this post, we will go through some additional examples of how to leverage these expressions.

Django has had one form of query expression for several years now: the F expression. F() can be used to reference an existing column in a query. This is often used for atomic update operations, such as incrementing a counter. However, F can also be used to compare two columns on a model when filtering. For instance, we may be interested in users who haven't logged in since their first two weeks on the site. That requires comparing the value of the last_login column with the date_joined column on the standard User model from contrib.auth:

from datetime import timedelta

from django.contrib.auth.models import User
from django.db.models import F
from django.utils.timezone import now

# Create some fake data: 10 active users and 20 inactive ones
today = now()
active_count = 10
inactive_count = 20
for i in range(1, active_count + inactive_count + 1):
    active = i <= active_count
    prefix = '' if active else 'in'
    domain = 'example.com' if i % 3 == 0 else 'caktusgroup.com'
    attributes = {
        'username': '{}active-{}'.format(prefix, i),
        'email': '{}active-{}@{}'.format(prefix, i, domain),
        'date_joined': today - timedelta(days=30),
        'last_login': today - timedelta(days=0 if active else 21),
    }
    User.objects.create(**attributes)
# Query inactive users
inactive = User.objects.filter(last_login__lte=F('date_joined') + timedelta(days=14))
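
With the fake data created above, this query should match exactly the twenty inactive users, which makes for a quick sanity check:

# Continued from above
assert inactive.count() == inactive_count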

The F expression supports basic arithmetic operations including some date math, as seen in the example above. However, it is still quite limited in comparison to what is available in SQL.
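
As a quick illustration of the atomic-update use case mentioned earlier, a counter column can be incremented in a single UPDATE statement without a read-modify-write race. This is a minimal sketch; the Post model and its view_count field are hypothetical:

from django.db import models
from django.db.models import F

class Post(models.Model):
    # Hypothetical model, used only for illustration
    view_count = models.PositiveIntegerField(default=0)

# The database performs the arithmetic in a single UPDATE, so concurrent
# requests cannot overwrite each other's increments.
Post.objects.update(view_count=F('view_count') + 1)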

Relational databases such as PostgreSQL support a number of built-in functions which you can leverage in the ORM using the Func expression added in 1.8. For example, you may want to examine the email domains of your user base. For that, you might use the split_part function in PostgreSQL to extract the domain of the email address column. To normalize the domain values, you can compose this with the built-in Lower expression:

# Continued from above
from django.contrib.auth.models import User
from django.db.models import F, Func, Value
from django.db.models.functions import Lower

qs = User.objects.annotate(domain=Lower(
    Func(F('email'), Value('@'), Value(2), function='split_part')))

This translates into the SQL call split_part("auth_user"."email", '@', 2) and annotates every user with a new domain attribute which is the domain of their email address. The value 2 passed to split_part says to take the second value after splitting the string. Unlike Python, this is a 1-based index rather than a 0-based index. With this we can find out what the most popular domains are for the users:

# Continued from above
from django.db.models import Count

popular = qs.values('domain').annotate(count=Count('id')).order_by('-count')
print(popular)
# Result
# [{'count': 20, 'domain': 'caktusgroup.com'},
# {'count': 10, 'domain': 'example.com'}]

As noted in the example, this returns a list of dictionaries of the form {'domain': <domain name>, 'count': #} ordered by the highest counts first. We can take this even further using conditional expressions.

Two more new expressions, Case and When, can be used to build conditional aggregates. For instance, we may want to count only users who have logged in recently:

# Continued from above
from django.db.models import Case, When
from django.utils.timezone import now

active = When(
    last_login__isnull=False,
    last_login__gte=now() - timedelta(days=14),
    then=Value(1))

active defines a conditional expression that matches when last_login is not null and is a date within the last 14 days. If a row matches, it adds the value 1 to the aggregate. This conditional expression can be passed into an aggregate expression such as Count, Sum, or Avg. To see which domains have the most recently active users, we’ll count the number of active users for each email domain.

# Continued from above
popular = qs.values('domain').annotate(
    count=Count('id'), active_count=Count(Case(active))).order_by('-active_count')
print(popular)
# Result
# [{'active_count': 7, 'count': 20, 'domain': 'caktusgroup.com'},
#  {'active_count': 3, 'count': 10, 'domain': 'example.com'}]

This adds a new key/value pair to the resulting dictionaries with the number of active users for the domain. Here caktusgroup.com has the most active registered users, but it also has the most registered users overall. For one last usage, we can look at the percent of users for each domain who are active, again using the F expression. Multiplying by Value(1.0) forces the database to use floating-point rather than integer division:

# Continued from above
popular = popular.annotate(
    percent_active=Value(1.0) * F('active_count') / F('count') * Value(100)
).order_by('-percent_active')
print(popular)
# Result
# [{'active_count': 7, 'count': 20, 'domain': 'caktusgroup.com', 'percent_active': 35},
#  {'active_count': 3, 'count': 10, 'domain': 'example.com', 'percent_active': 30}]

Again, this adds another data point to the returned dictionaries: the percent of active users. Now we know which email domains are associated with the most users, the most recently logged in users, and the percent of users with that domain who have been recently active.

Query expressions like Func allow you to make more complex queries, leveraging more of your chosen database’s power without having to drop to raw SQL. Combined with the aggregation and conditional expressions you can roll up additional statistics about your data set using the expressive power of the ORM. I hope these examples give a good overview of some of the queries that are now easy to handle in the ORM and which previously required raw SQL.

Caktus Group: The Journal of Medical Internet Research Features Epic Allies Phase 1 Study Results

The Journal of Medical Internet Research recently published “Epic Allies: Development of a Gaming App to Improve Antiretroviral Therapy Adherence Among Young HIV-Positive Men Who Have Sex With Men”. Epic Allies, initially funded by a federal Small Business Innovation Research (SBIR) grant, represents a partnership between Caktus, UNC’s Institute for Global Health and Infectious Diseases, and Duke Global Health Institute.

The article highlights the challenges of medication adherence, emphasizing the concerns of study participants directly:

“Yeah, cause honestly, it was a good few months before I ever took medication. And in that timeframe of diagnosis to taking medication, it was very easy for me to detach. It was very easy for me to say, this is not real, nahhh, whatever. It didn’t become real until I had to take a pill. When you take a pill, it’s real.” - Study participant.

The team used continuous participant feedback to iteratively develop the application. Ultimately, the study found that this iterative approach to application development was what made it “highly acceptable, relevant, and useful by YMSM (young men who have sex with men).”

The study authors are Sara LeGrand, PhD; Kathryn Elizabeth Muessig, PhD; Tobias McNulty, BA (Caktus); Karina Soni, BA; Kelly Knudtson, MPH; Alex Lemann, MS (Caktus); Nkechinyere Nwoko, BA (Caktus); and Lisa B Hightow-Weidman, MPH, MD.

To read the study in full, visit http://games.jmir.org/2016/1/e6/.

Edited to add: Epic Allies was built with Unity and the Python backend is built on top of Django and Django REST Framework.

Caktus Group: PyCon 2016 Recap

PyCon, beyond being the best community event for Python developers, is also an event that we happily began thinking about eleven months ago. Almost as soon as PyCon 2015 ended, we had the good fortune of planning the look and feel of PyCon 2016 with organizer extraordinaires Ewa Jodlowska, Diana Clarke, and, new this year, Brandon Rhodes. Our team has loved working with the organizers on the PyCon websites for the past three years now. They’re great people who always prioritize the needs of PyCon attendees, whether that’s babysitting services or a smooth PyCon web experience.

Seeing the PyCon 2016 Artwork

The Caktus team arrived in Portland and were almost immediately greeted with large-scale versions of the artwork our team made for PyCon. Seeing it on arrival, throughout the event, and especially during the keynotes was surreal.

PyCon 2016 sponsor banner

Getting ready for the tradeshow

Our team got ready for the booth first, ahead of the PyCon Education Summit and Sponsor Workshops where we had team members speaking. Here’s the booth before everyone came to grab t-shirts and PyCon tattoos and to learn more about us.

The Caktus booth at PyCon before the festivities begin.

Here’s a closeup of our live RapidPro dashboard too.

The RapidPro live dashboard Caktus built for PyCon.

Supporting our team members

This year, at the PyCon Education Summit, Rebecca Conley spoke about expanding diversity in tech by increasing early access to coding education. Erin Mullaney and Rebecca Muraya spoke at a Sponsor Workshop on RapidPro, UNICEF’s SMS application platform. Sadly, we didn’t get a picture of Rebecca C, but Erin shared this picture of herself and Rebecca M. on Twitter.

Erin and Rebecca M. after giving their RapidPro talk at PyCon.

Tradeshow time!

PyCon, for our booth team, is always intense. Here’s a taste of the crowds across three days.

A busy crowd around the Caktus booth.

The excitement, of course, included a giveaway. Here’s the winner of our BB8 Sphero Ball raffle prize, Adam Porad of MetaBrite, with our Sales Director, Julie White:

PyCon attendee wins the Caktus BB8 Sphero giveaway.

So many talks

With our office almost empty and most of our team at PyCon, we went to a lot of talks, too many to list here (don’t worry, we’re going to start highlighting the talks in our annual PyCon Must See Series). We do want to highlight one of the best things about the talks: the representation of women, as described by the PyCon Diversity chair.

Across three packed days, here are some of the topics we got to learn more about: real-time train detection, inclusivity in the tech community, and better testing with less code. With the videos now available, we can still catch all the great talks even if we couldn’t be there.

PyLadies auction

One of the highlights of PyCon is definitely the PyLadies auction. Every year, it’s a raucous event that’s just plain fun. This year, we contributed original concept art for the PyCon 2016 logo. It went for $650 to Jacob Kaplan-Moss, the co-creator of Django. Since we’re a Django shop, there definitely was quite a bit of excited fandom on our part.

Jacob Kaplan-Moss holds won auction item: Caktus' early concept art for PyCon 2016 logo

And we can’t leave without a cookie selfie

Whoever came up with the cookie selfie idea is brilliant. Here’s Technical Director Mark Lavin with his cookie selfie.

Hope to see you next year!

In the meantime, make sure to return to our blog for our annual PyCon Must See Series.

Caktus Group: My First Conference Talk: Reflecting on Support and Inclusivity at DjangoCon Europe 2016

The environment at Caktus is, above all, one of encouragement. I experienced that encouragement as an intern and continue to experience it as a full-time developer. In addition to providing workplace mentorship, Caktus encourages all of its employees to submit talks to conferences. My manager Mark Lavin and mentor Karen Tracy encouraged me to get over my concerns about being new to the field and to start submitting talks.

Along with the support from Caktus came the impetus from Djangocon Europe for first-time speakers to submit. Djangocon Europe’s Call for Papers (CFP) includes suggested topics and offers of support, beginning with brainstorming a topic and extending to mentorship if your talk is chosen. I took them up on this offer and floated a couple of ideas over email. I got a very quick response with the suggestion that I expand a previous blog post I had written on my mid-career transition into a talk. Baptiste Mispelon and Xavier Dutreilh continued to be helpful and responsive throughout the application process and made me feel that my contribution was valued and that I was being taken seriously, whether or not my talk was ultimately selected.

A week later, I received the notification that my talk was selected. The support continued from Caktus, the broader local development community, and the Djangocon Europe organizers. Mark helped me refine my talk content, and Caktus coworkers and Pyladies helped me organize public previews of the talk. Djangocon Europe opened a Slack mentor channel in which I was able to ask a lot of questions about talks in general and about how to communicate effectively with an international audience. The refinement and confidence gained from these experiences helped send me to Europe excited about giving my first talk.

The organizers made travel easy, opening a Slack channel for attendees to ask general questions about the conference. I arrived in Budapest, got a shuttle to the hotel, and checked into my room. Then I got on the Slack #speakers channel and asked if anyone wanted to join me for dinner. I ended up with two fantastic dinner companions, Andrew Godwin and Anna Schneider. Over dinner I learned about developing for non-profits, London’s economic development, and many other fascinating things.

Budapest is beautiful and extremely friendly and walkable. In general, Google Maps and Google Translate worked to help me get around after initially leading me astray on my first walk to the venue. Once I arrived, I was greeted with signs telling me I was welcome, I looked awesome, what the code of conduct was, and what phone numbers and email to use if I had any concerns. The conference was well staffed with friendly folks to direct attendees and to answer questions. The food and snacks were good. There were dedicated quiet spaces and a separate prayer room. Attention to all of these details showed that the conference organizers carefully considered the needs and comfort of all the attendees in their planning and made us feel valued.

Throughout the conference and afterwards, Djangocon Europe showed particular dedication to the Code of Conduct and the principles behind it, which generally amount to “be kind to others.” All conference slides were reviewed by the organizers to make sure they adhered to the Code of Conduct. Light-hearted but direct signs in the bathrooms made it clear that gender non-conforming attendees were welcome and that their safety and comfort were important. During the conference, an announcement was made regarding a slide that had been added after the screening and brought to the attention of the organizers as a violation of the Code of Conduct. This announcement demonstrated that complaints were taken seriously and handled quickly. It served to make us all feel that our safety and comfort was a priority. I even saw an interaction on Slack that enforced these values of inclusion and kindness. The conversation started with someone giving a lightning talk asking an organizer to screen his slides. The organizer pointed out a slide with a photo that could be seen as objectifying the woman in the photo. The speaker agreed and removed the slide. It was a simple interaction from which everyone learned. As a female speaker, I felt that the organizer was absolutely looking after both my interests and the interests of the presenter. Soon after the conference, Djangocon Europe published a Transparency Report detailing protections put in place, issues that arose, and the way those issues were handled. No conference can completely control attendee behavior, but attentiveness and transparency like this should set the standard for how conferences create safe and inclusive environments.

DjangoCon Europe 2016 venue

The venue was very attractive and comfortable, with a small theater for the talks as well as a library and balcony where talks were streamed for those who wanted a smaller, quieter setting. Having those options definitely helped me enjoy the conference as I had speech preparation in mind, along with getting the most I could from the talks.

The first talks emphasized the tone of welcoming and mutual respect. In their talk, “Healthy Minds in a Healthy Community”, Erik Romijn & Mikey Ariel spoke frankly and personally about the struggles many of us face in the open source community to maintain physical and mental health while facing the demands of our jobs as well as the added desire to contribute to open source projects. As a new developer, it was really important for me to hear that all the people I perceive as “rockstars” and “ninjas” are just as human as I am and that we all need to take care of each other. It also inspired me to reflect on my gratitude that I work at Caktus, where we are all valued as people and our health and happiness is a priority of the company.

The talks were all fantastic, a nice blend of topics from debugging to migrations to the challenge and necessity of including languages that don’t read left-to-right, given by women and men from all over the world. I felt honored to be among them and pleased that the organizers felt a mid-career transition into programming merited a speaking slot. The whole experience continued to be enjoyable, especially the speaker dinner consisting of traditional Hungarian food. At the dinner I had the chance to learn about the developing tech scene in Eastern Europe and the assumptions we had about each other on either side of the Iron Curtain in the mid-1980s. Software itself is impressive. However, it is only when we get to understand the people who are making it and the people for whom we are making it that software’s real meaning and value become evident. Djangocon Europe definitely facilitated that kind of understanding. Another highlight of the evening was receiving my speaker gift, some souvenirs from Budapest and a handwritten note thanking me for participating, which made me feel very appreciated.

Before a talk.

My talk was on the last morning, and while I expected everyone to be tired from the party the night before with live music in one of Budapest’s “ruin pubs,” there was a good crowd. The emcee Tomasz Paczkowski did a great job preparing speakers, including me, before we spoke, and enforcing the “no comments, only questions” policy after we finished speaking. Speakers were also given the option to have no questions from the stage. I didn’t choose that, but I see how that option would be valuable to some speakers.

What I didn’t know when I first submitted my talk was that it was a single-track conference. I learned that when I saw the schedule. My audience was the whole conference, as it was for all the speakers. It was daunting at first to know that all eyes would be on me (at least the eyes of everyone who chose to attend a talk). I went into the room the night before and stood on stage, looking at several hundred empty chairs and absorbing the idea that they would be full of people watching me the next day. Fortunately I now knew who at least some of these people were, and I had seen in general how they responded positively to each other and to other speakers. I have performed as a dancer in front of large crowds plenty of times, but had never given a professional talk to an audience of that size. The beauty of the space and the familiarity of being on stage certainly helped ease my apprehension.

By the last day of the conference, I felt so comfortable and appreciated that I enjoyed giving my talk, From Intern to Professional Developer: Advice on a Mid-Career Pivot, immensely. I was a little bit nervous, but just enough for it to be motivating. A number of people made a point of encouraging me throughout the week and being present as smiling faces close to the front during my talk. It went by quickly. I tried to remember to breathe and look up (both things I forget to do when I’m nervous). The crowd was polite and responsive. I got some good questions and follow-up from people who had made a similar transition or were thinking about it, as well as questions from some hiring managers. I felt like I was able to make a valuable contribution to the conference and to the community through my talk, and I am grateful to Djangocon Europe and Caktus for making it all possible.

Caktus Group: Code for Durham: National Day of Civic Hacking Recap

Code for Durham recently participated in Code for America’s National Day of Civic Hacking. Hosted in the Caktus Group Tech Space, the event was attended by more than 50 local participants.

Jason Hibbetts, community manager of opensource.com, acted as emcee for the day. He introduced the two planned projects from Code for Durham and fielded project pitches from other attendees.

During the day, participants broke into teams to work on a total of five projects. These included Code for Durham’s School Navigator—a platform for geolocating nearby publicly-funded schools and accessing information on those schools—and CityGram—a notifications platform for local issues like crime data reporting or building permit changes.

School Navigator team

Caktus’ Chief Business Development Officer Alex Lemann helped coordinate the team working on Durham’s School Navigator. The group consisted of thirteen developers, two UX designers, and Councilman Steve Schewel, who came to learn about the project. Code for Durham members were able to onboard all of the day’s participants to the project, add them as collaborators to the GitHub repo, and introduce them to the project’s backlog. Team members then selected their own tasks from the backlog to tackle.

A number of updates were made to the school navigator, including changes to frontend code, revisions to school policy information, deployments to help keep the site running smoothly, and an entire redesign of the school profile pages from a UX perspective.

School Navigator Project Intro

Alex was especially excited by his contribution to the day’s work. Noticing that GitHub only measures collaboration on a project in terms of committed code, Alex created a way of honoring alternative contributions. To accomplish this, he developed contributors.py, which uses the GitHub API to look up every comment added to a particular project. This data is then compiled into a contributors list, making recognition of contributions to a project more transparent and inclusive.
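
The core idea is simple enough to sketch in a few lines. What follows is a hypothetical, minimal version of the approach rather than the actual contributors.py: it walks GitHub's paginated issue-comments endpoint with the requests library and tallies comment authors (the repository name in the usage example is illustrative).

from collections import Counter

import requests

def comment_authors(owner, repo):
    """Count issue and pull request comments per GitHub user."""
    counts = Counter()
    url = 'https://api.github.com/repos/{}/{}/issues/comments'.format(owner, repo)
    while url:
        response = requests.get(url)
        response.raise_for_status()
        for comment in response.json():
            counts[comment['user']['login']] += 1
        # Follow GitHub's pagination links until they run out
        url = response.links.get('next', {}).get('url')
    return counts

# Illustrative usage; any owner/repository pair works:
for user, total in comment_authors('codefordurham', 'school-navigator').most_common():
    print('{}: {} comments'.format(user, total))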

Ultimately, the day was a success. “Participants were enthusiastic and made significant contributions to the projects,” Alex commented. “It is important to contribute to open source projects and give back to the technical community. But it is additionally rewarding to contribute to projects you know are helping people nearby in your very own neighborhood.”

Edit your city

Though the National Day of Civic Hacking is over, work on these projects is ongoing. To get involved, check Code for Durham’s list of upcoming events and be sure to attend one of their civic hack nights.

Not local? According to U.S. Chief Technology Officer Megan Smith, more than 100 civic hacking events were held nationwide. To learn how to get involved in your area, visit codeforamerica.org.

Jeff Trawick: A few quick notes on trying out Windows 10 WSL (Windows Subsystem for Linux)


I had a chance to play with the Ubuntu userspace support in Windows 10 recently. I started with Windows Insider Preview build 14295 from MSDN, enabled the "Fast Ring" for Windows Insider updates, and then updated to build 14316 and activated the subsystem. After rebooting, running bash from PowerShell installed the Ubuntu base image.

The first use case was building the APR 1.6.x tree from svn and running the test suite. The build was uneventful and most testcases pass. From looking only at the test suite output, it seems that Sys V semaphores and the FREEBIND socket option aren't implemented, epoll isn't close enough to Linux, sendfile isn't close enough to Linux, and perhaps something about file locking isn't close enough. That's not so bad all in all. I think the next steps here are to identify some particular discrepancy from Linux, report it, and see if they bite. Separately, setting some ac_cv_FOO variables might be close to enough to get the testcases to run completely (e.g., use poll() instead of epoll(), use a different default proc mutex, etc.).

The second use case was trying out a Python+Django+PostgreSQL project, checking if it is usable and test suites for some projects I work on pass. (Such code "should" work fine on native Windows given enough time to mess around with installing different server software, getting native Python extensions to compile, etc. Yuck.) Unfortunately, the lack of SysV IPC breaks PostgreSQL setup. (See this GitHub issue for the PostgreSQL failure and this suggestion for supporting SysV IPC.)

So what is the point of WSL, for me at least?

  • Garbage collect the last 3 or so ways to get Unix-y stuff running on Windows and make the installation and update effort for such tools largely disappear (apt-get). (I.e., greatly improve the theoretical best-of-both-worlds development setup.)
  • Improve some use cases for the occasional savvy Windows-bound consumer of <dayjob>, which would require being able to install a Django application using the same Ubuntu and Python packages and the same setup instructions as on Ubuntu, in order to run management commands and maybe occasionally use the dev server for testing very simple changes.
  • Have something new for me to yell at OS X users with broken tools who have been ignoring my pleas to install an Ubuntu VM.

Caktus Group: What We’re Clicking - May Link Roundup

Below you can find this month’s roundup of articles and posts shared by Cakti that drew the most attention on Twitter. The list covers coding for matrix factorization algorithms in Python, designing apps that optimize for sequential dual screen usage, preventing technical debt, and understanding the complexities and limitations involved in building apps for low-income American families.

Finding Similar Music Using Matrix Factorization

A step-by-step guide to calculating related music artists with matrix factorization algorithms. This tutorial is written in Python using Pandas and SciPy for calculations and D3.js for interactive data visualization.

Windows on the Web

Completing a task across multiple devices is a common habit. And yet, according to the author of this piece, Karen McGrane, this practice is rarely considered in user design scenarios. In this article, McGrane contemplates how to design the best user experience for sequential dual screen usage.

Technical Debt 101

This article is a detailed explanation of technical debt and the negative consequences of sacrificing code quality.

What I Learned from Building an App for Low-Income Americans

Ciara Byrne’s thoughtful article on lessons learned from her experience building an app for low-income Americans. Byrne reflects not only on the challenges involved in designing for this particular community of users but also the complex definitions of low-income that must be taken into account when approaching similar projects.

Caktus Group: Code for Durham and a National Day of Civic Hacking

This Saturday, June 4th, Caktus Group will be hosting Code for Durham as they join Code for America’s National Day of Civic Hacking. The day is a chance for everyone from developers, to government employees, to residents who care about their city to come together and use their talents to help the community. Attendees will collaborate on civic tech projects to be used by citizens and government employees. These projects seek to provide data on or improve government processes, addressing issues like health care, affordable housing, criminal record access, police data, and more.

The Code for Durham event will support work on several ongoing projects. These include CityGram and the Durham School Navigator. CityGram is a notifications platform for local issues like crime data reporting or building permit changes. Durham School Navigator enables users to geolocate nearby publicly-funded schools and view information like performance ratings, filter by magnet, charter, or public schools, and demystify school zoning patterns. Aside from these two projects, there will also be a period for attendees to pitch new project ideas.

The day will be filled with opportunities to contribute to projects that make Durham better. See the rest of the day’s schedule below and register for the event here.


National Day of Civic Hacking - Saturday, June 4th

Kickoff & Project Sprint Pitches (10:00 am - 10:45 am)

Hear from the two existing, documented projects (Citygram and Durham School Navigator). Open floor for other project ideas.

Sprints (11:00 am - 3:30 pm)

Break out into work groups for the various projects.

Civic Hacking 101 (11:00 am - 12pm)

Open meeting with Red Hat’s Jason Hibbetts, community manager of opensource.com, and City of Durham’s Laura Beidiger. They will provide background on Code for America and demo some of the local civic apps Code for Durham has built. Finally, they will facilitate signing up to participate in a community app user testing group.

Lunch (12:00 pm)

Catered lunch from The Farmery in Durham ($5.00)

Read Out (3:30 pm - 4:00 pm)

Sprint teams demo their projects and report on progress from the day’s work.

Caktus Group: Where to Find Cakti at PyCon 2016

As Django developers, we always look forward to PyCon. This year, working with the Python Software Foundation on the design for PyCon 2016’s site kindled our enthusiasm early. Our team is so excited for all the fun to begin. With an array of fantastic events, speakers, and workshops, we thought we would highlight all the events we’ll be participating in. Come find us!

Sunday, May 29th

Python Education Summit: Outside the Pipeline: Expanding Early Access to Coding as a Career Choice (3:10 pm)

Rebecca Conley will be speaking about how to increase diversity in tech by expanding early access to coding education.

Sponsor Workshop: Leveraging Text Messaging in 2016 with RapidPro (3:30 pm)

Attend Erin Mullaney and Rebecca Muraya’s workshop on building SMS surveys. The workshop will include an overview of up-to-date case studies involving the use of SMS for surveys, crises, elections, and data tracking. Erin and Rebecca will also cover RapidPro, UNICEF’s open source SMS mobile messaging platform. In addition to basic functionality, they will demonstrate how to extend RapidPro’s core functionality through use of the API and how to manage SMS security.

Monday, May 30th - Tuesday, May 31st

Trade Show (8:00 am - 5:00 pm)

Don’t forget to stop by our trade show booth where you can take our 5-question project health quiz and chat about results with our experts. You can also sign up for one of our limited time slots to discuss upcoming projects.

Our booth, double in size this year, will also feature a live RapidPro survey about PyCon attendees. We’ll also have some sweet swag, like PyCon 2016 temporary tattoos, Django-themed t-shirts, and more. Plus, you can enter to win an authentic BB8 Sphero!

pycon 2016 temporary tattoos

Monday, May 30th - Wednesday, June 1st

Open Spaces

We’re planning on hosting a few Open Spaces discussions. Times TBD. Be sure to look for our discussion topics on the Open Spaces board!

  • Scrum at Caktus
  • RapidPro Flow Editor overview
  • Using the Django project template
  • Open data policing and getting involved in civic tech
  • RapidPro deployment and usage
  • AWS deployment using Python
  • Python in civic tech and Code for America projects
  • Building APIs
  • Teaching community-centered Python classes
  • Python meetup groups - supporting PyLadies and Girl Develop It

Tuesday, May 31st

Charity Fun Run (6:00 am)

Mark Lavin will be running in the annual 5k charity fun run. This year’s funds will be donated to Big Brothers Big Sisters Columbia Northwest.

PyLadies Auction (6:30 pm)

There are always fun and fantastic items up for bid during the PyLadies benefit auction. This year, Caktus is contributing a framed piece showcasing early concept sketches of the PyCon 2016 website.

early designs for PyCon 2016

Wednesday, June 1st

Job Fair (10:00 am - 1:00 pm)

Find out how you can grow with us! Stop by our booth at the Job Fair for information on our open positions. We’re currently looking for sharp Django web developers to fill full-time and contractor positions.

Caktus Group: Mark Lavin to Give Keynote at Python Nordeste

Mark Lavin will be giving the keynote address at Python Nordeste this year. Python Nordeste is the largest gathering of the Northeast Python community, which takes place annually in cities of northeastern Brazil. This year’s conference will be held in Teresina, the capital of the Brazilian state of Piauí.

Mark will be speaking from his love of long-distance running. Applying endurance sports training methodologies to development training, he will provide a roadmap for how to improve as a developer throughout your career.

Mark Lavin is a co-author of Lightweight Django from O'Reilly. The book was recently published in Portuguese and is available in Brazil under the title Django Essencial from publisher Novatec. He has also recorded a video series called "Intermediate Django", which focuses on integrating background jobs with Celery and best practices for growing Django developers. Mark is an active member of the Django community, and you can often find him contributing to the Django Project or answering questions on StackOverflow.

Tim Hopper: Install Apache Storm with Conda

I'm looking into using Apache Storm for a project, and I've been fiddling with several different versions in my local testing environment.

I made this easier for myself by adding binaries for Storm 0.10.1 and Storm 1.0.1 to my Anaconda.org channel. That means you can add the Storm binary to your path with

conda install -c tdhopper apache-storm=1.0.1

or

conda install -c tdhopper apache-storm=0.10.1

Caktus Group: Caktus CTO Colin Copeland Invited to the White House Open Police Data Initiative

We at Caktus were incredibly proud when the White House Police Data Initiative invited CTO Colin Copeland to celebrate its first-year accomplishments. While at the White House, Colin also joined private breakout sessions to share ideas with law enforcement officials, city staff, and other civic technologists from across the country. Colin is the co-founder of Code for Durham and served as lead developer for OpenDataPolicingNC.com, a site built for the Southern Coalition for Social Justice that displays North Carolina police stop data.

When he returned, we couldn’t wait to learn more about his perspective on the talks given (video available) and the breakout discussions held. Here’s a condensed version of our conversation with him.

Can you tell us what the White House Police Data Initiative is?

It’s an effort by the Obama administration to use open data and partner with technologists to strengthen the relationship with citizens and police. The goal is to increase transparency and, as a result, build trust and accountability. It has grown a lot—53 law enforcement agencies. It’s an incredible initiative.

What was it like to be at the White House Police Data Initiative celebration?

It was super exciting to be at the White House and to see demonstrations of what has been accomplished. Fifty-three new law enforcement agencies had signed on to the initiative with close to 100 datasets. I felt lucky to be part of it since what we do with OpenDataPolicingNC.com is such a small piece of the whole effort.

Seeing other initiatives and what other police departments are doing was invigorating. It really made me feel motivated to keep doing what we’ve been doing, especially after seeing other Code for America projects. I also liked being able to hear the perspectives of people from vastly different backgrounds, whether it was someone in the police department or the city. The event was about learning from people all over the country.

Can you describe a couple perspectives you found interesting?

Ron Davis [Director of the federal Office of Community Oriented Policing Services] had a perspective grounded in making sure all police officers, from the rank and file to leadership, understood why open data benefits them. It’s an important aspect. If they don’t see the benefit and buy-in, it’s much harder to advocate for open data.

Also Chief Brown [Dallas] emphasized that releasing data led to community trust. The value you get out of that outweighs any internal pressure not to do it. He was straightforward about how doable it is to release data, how good it was for Dallas, and encouraged other police departments to do the same.

What do you think is the greatest challenge to open data policing efforts for interested agencies?

Knowing what to share is a hurdle for most new agencies. There was some discussion of building a guide or toolkit to share ways to implement this in your city. Small police agencies do not want to reinvent the wheel, so they need easier onboarding. We need to make it easier for everyone to get involved.

What was something new you learned about open data policing?

I learned a lot. It was a lot of interesting, new perspectives, innovative partnerships. But there was one aspect: there’s not a lot of data standards for how police track and report various metrics, including use-of-force. So you can’t always compare one jurisdiction to another. It can look bad for one department versus another because you used a different set of criteria. There need to be greater standards in order to better share information.

What’s next for you with open data policing?

There’s going to be an expansion of OpenDataPolicingNC.com and that’s through Code for Durham. We’re going to be using geolocational data provided by Fayetteville and Police Chief Harold Medlock. He asked us to map the data to see what it highlights. We hope other agencies can use it, too, once the Fayetteville one is online. It’s an exciting project and we’re honored Chief Medlock asked us to help out.

Colin Copeland at the White House Police Data Initiative. Also pictured, representatives from other open police data initiatives.

Caktus Group: What We’re Clicking - April Link Roundup

It's time for this month’s roundup of articles and posts shared by Cakti that drew the most attention on Twitter. The list highlights new work in civic tech and international development as well as reasons for the increasing popularity of Python and open source development.

Python is an Equal Opportunity Programming Language

An interview with David Stewart, manager in the Intel Data Center Software Technology group, about the unique accessibility of the Python programming language as well as the inclusivity of its community.

Why Every Developer is an Open Source Developer Now

A short article on why the future of IT lies in open source collaboration.

A Debate Where the Voters Pick the Questions

The Atlantic’s profile of the Florida Open Debate platform. Caktus Group helped build the tool on behalf of the Open Debate Coalition. The platform powered the first-ever crowd-sourced open Senate debate.

Making it Easy to Bring Cellphone Apps to Africa

A wonderful Fast Company profile of Africa’s Talking, a startup devoted to making it easier for developers to disseminate SMS-based apps to cell phone users in Africa.

Caktus Group: Florida Open Debate Platform Receives National Attention (The Atlantic, USA Today, Engadget)

Several national publications have featured the Florida Open Debate platform, including USA Today, Engadget, and The Atlantic. Caktus helped develop the Django-based platform on behalf of the Open Debate Coalition (ODC) in advance of the nation’s first-ever open Senate debate held in Florida on April 25th. The site enabled citizens to submit debate questions as well as vote on which questions mattered most to them. Moderators then used the thirty most popular questions from the site to structure the debate between Florida Senate candidates David Jolly (R) and Alan Grayson (D). According to The Atlantic, more than 400,000 votes were submitted by users on the site, including more than 84,000 from Florida voters.

Florida Open Debate user-submitted questions

“Normally, the press frames important US election debates by choosing the questions and controlling the video broadcast,” wrote Steve Dent. “For the first time, however, the public... decide[d] the agenda.”

In his article for The Atlantic, Russell Berman also applauded the site’s effort “to make bottom-up, user-generated questions the centerpiece of a debate.” But possibly more significant were the results of this crowd-sourced content. “What transpired was, by all accounts, a decent debate,” Berman writes. “For 75 minutes, Grayson and Jolly addressed several weighty policy disputes—money in politics, Wall Street reform, the minimum wage, climate change, the solvency of Social Security—and often in detail.”

The Florida debate was streamed live on Monday to more than 80,000 viewers. The Open Debate platform is receiving attention and interest from various potential debate sponsors as well as the Commission on Presidential Debates for possible use in this fall’s presidential elections.

Caktus Group: ES6 For Django Lovers

The Django community is not one to fall to bitrot. Django supports every new release of Python at an impressive pace. Active Django websites are commonly updated to new releases quickly and we take pride in providing stable, predictable upgrade paths.

We should be as adamant about keeping up that pace on our frontends as we are about all the support Django and Python put into the backend. I think I can make the case that ES6 is part of that natural forward pace for us, and I hope to help you get started upgrading the frontend half of your projects today.

The Case for ES6

As a Django developer and likely someone who prefers command lines, databases, and backends you might not be convinced that ES6 and other Javascript language changes matter much.

If you enjoy the concise expressiveness of Python, then ES6's improvements over Javascript should matter a lot to you. If you appreciate the organization and structure Django's common layouts for projects and applications provides, then ES6's module and import system is something you'll want to take advantage of. If you benefit from the wide variety of third-party packages the Python Package Index makes available to you just a pip install away, then you should be reaching out to the rich ecosystem of packages NPM has available for frontend code, as well.

For all the reasons you love Python and Django, you should love ES6, too!

Well Structured Code for Your Whole Project

In any Python project, you take advantage of modules and packages to break up a larger body of code into sensible pieces. It makes your project easier to understand and maintain, both for yourself and other developers trying to find their way around a new codebase.

If you're like many Python web developers, the lack of structure between your clean, organized Python code and your messy, spaghetti Javascript code is something that bothers you. ES6 introduces a native module and import system, with a lot of similarities to Python's own modules.

import React from 'react';

import Dispatcher from './dispatcher.jsx';
import NoteStore from './store.jsx';
import Actions from './actions.jsx';
import {Note, NoteEntry} from './components.jsx';
import AutoComponent from './utils.jsx';

We don't benefit only from organizing our own code, of course. We derive untold value from a huge and growing collection of third-party libraries available in Python and often specifically for Django. Django itself is distributed in concise releases through PyPI and available to your project thanks to the well-organized structure and the distribution service provided by PyPI.

Now you can take advantage of the same thing on the frontend. If you prefer to trust a stable package distribution for Django and other dependencies of your project, then it is a safe bet to guess that you are frustrated when you have to "install" a Javascript library by just unzipping it and committing the whole thing into your repository. Our Javascript code can feel unmanaged and fragile by comparison to the rest of our projects.

NPM has grown into the de facto home of Javascript libraries and grows at an incredible pace. Consider it a PyPI for your frontend code. With tools like Browserify and Webpack, you can wrap all the NPM installed dependencies for your project, along with your own organized tree of modules, into a single bundle to ship with your pages. These work in combination with ES6 modules to give you the scaffolding of modules and package management to organize your code better.

A Higher Baseline

This new pipeline allows us to take advantage of the language changes in ES6. It exposes the wealth of packages available through NPM. We hope it will raise the standard of quality within our front-end code.

This raised bar puts us in a better position to continue pushing our setup forward.

How Caktus Integrates ES6 With Django

Combining a Gulp-based pipeline for frontend assets with Django's runserver development web server turned out to be straightforward when we inverted the usual setup. Instead of teaching Django to trigger the asset pipeline, we embedded Django into our default gulp task.

Now, we set up livereload, which reloads the page when CSS or JS has changed. We build our styles and scripts, transforming our Less and ES6 into CSS and Javascript. The task will launch Django's own runserver for you, passing along --address and --port parameters. The rebuild() task, which the task below delegates to, will continue to monitor all our frontend source files for changes and automatically rebuild them when necessary.

// Starts our development workflow.
// Assumed requires for this excerpt (the exact modules may differ):
var gulp = require('gulp');
var livereload = require('gulp-livereload');
var spawn = require('child_process').spawn;
var argv = require('yargs').argv;  // supplies the --address and --port flags
// rebuild() is defined elsewhere in our gulpfile; it compiles the frontend
// assets and watches them for changes.
gulp.task('default', function (cb) {
  livereload.listen();

  rebuild({
    development: true,
  });

  console.log("Starting Django runserver http://"+argv.address+":"+argv.port+"/");
  var args = ["manage.py", "runserver", argv.address+":"+argv.port];
  var runserver = spawn("python", args, {
    stdio: "inherit",
  });
  runserver.on('close', function(code) {
    if (code !== 0) {
      console.error('Django runserver exited with error code: ' + code);
    } else {
      console.log('Django runserver exited normally.');
    }
  });
});

Integration with Django's collectstatic for Deployments

Options like Django Compressor make integration with common Django deployment pipelines a breeze, but you may need to consider how to combine ES6 pipelines more carefully. By running our Gulp build task before collectstatic and including the resulting bundled assets — both Less and ES6 — in the collected assets, we can make our existing Gulp builds and Django work together seamlessly.
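
One way to wire this up is to point collectstatic at the directory where Gulp writes its output. Here is a minimal settings.py sketch; the build/ directory name is an assumption, so adjust it to wherever your Gulp tasks put their bundles:

import os

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

STATICFILES_DIRS = [
    # Gulp writes the compiled CSS and Javascript bundles here (assumed
    # path); running the Gulp build before collectstatic makes these
    # bundles part of the collected static files.
    os.path.join(BASE_DIR, 'build'),
]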

Tim Hopper: Backpacking for the Very Tall

I created a single page website to collect notes on one of my other hobbies: ultralight backpacking. In particular, notes on ultralight gear for the very tall.

Caktus Group: Kittens at Caktus: Raising Money and Awareness for Local Charity Alley Cats and Angels

This week we welcomed three kittens into our office as part of a campaign to raise funds and awareness for local non-profit Alley Cats and Angels. This organization is dedicated to improving the lives of stray, abandoned, and feral cats. They also work to reduce the number of homeless cats in the Triangle through adoption, farm cat, and spay/neuter assistance programs.

Kittens at Caktus (affectionately called Catkus Day) was a direct result of Lead Developer and Technical Manager Karen Tracey’s volunteer efforts with Alley Cats and Angels. After seeing some of her adorable fosters, the team suggested that they come to our office for a day.

As an overwhelmingly cat-friendly office and a long-time supporter of Alley Cats and Angels, we found that inviting kittens into our office was a fun, new way to support a great cause.

Follow the link to donate to Alley Cats and Angels or to find other opportunities for helping stray and abandoned cats and kittens in the Triangle.

NC Nwoko with kitten

Daryl Riethof with kitten

Caktus Group: Florida Open Debate Site Powers First-Ever Crowd-Sourced Open Senate Debate

Florida Open Debate launched ahead of the upcoming, bi-partisan debate between candidates for the Florida Senate. The site, which crowdsources debate questions from the general public, was met with national acclaim. Citizens can not only submit questions, but also vote on which ones matter most. Caktus helped develop the tool on behalf of the Open Debate Coalition (ODC), a non-partisan organization dedicated to supporting participatory democracy through the use of civic tech.

The coalition formed during the 2008 presidential election to combat a sharp decline in voter participation as well as lack of representation in the debate arena. According to The Los Angeles Times, “the job of designing and choosing questions is left to the media host.” The ODC recognized the need to increase participation in and access to the debate system by reaching as many American citizens as possible.

"The tool truly is an open forum for US citizens to participate in the political process,” says Ben Riseling, Caktus project manager. “Anyone can submit a question and vote on which questions should or should not be discussed. We’re extremely honored to be asked to participate in making this site available during this election season.”

The debate between Florida Senate candidates David Jolly (R) and Alan Grayson (D) takes place Monday, April 25th at 7:00 pm EDT and will be live streamed on the Florida Open Debate site itself.

Jeff Trawick: Brief notes on replacing the HD in an Asus M32CD with SSD


The goal: Clone the Windows 10 OS partition from the original 1TB hard disk to a new 480GB Crucial SSD

The 'net said to use the Clone feature of Todo Backup Free 9.1. I tried. The clone didn't work. Both the sector for sector copy and the SSD optimize options were selected. The problem symptom after switching cables was that the machine didn't boot but instead loaded the Asus BIOS screen. (What may have been different for me: GPT instead of MBR, the particular arrangement of partitions on the 1TB source drive, user error, etc.)

The 'net said to use AOMEI Partition Assistant Standard Edition. I tried. This free version does not support GPT partitions, so it wouldn't do anything. I didn't pay up to get the version that does.

The 'net said to use Paragon Migrate OS to SSD 4.0. PC Magazine had a review that suggested that it would be easy to use for my situation. (That is not different than what I read before about other solutions.) I paid the $19.95. It just worked, no drama. Done.

Caktus Group: ShipIt Day Recap: Q2 2016

Last Friday, the Cakti set aside regular client projects for our quarterly ShipIt Day, a chance for personal development and independent projects. People work individually or in groups to flex their creativity, tackle interesting problems, or expand their personal knowledge. This quarter’s ShipIt Day was all about open source contributions, frontend fixes, and learning new (or revisiting old) programming languages. Read about the various ShipIt Day projects for Q2 of 2016 below.


Dan Poirier decided to teach himself a new programming language for ShipIt Day and made his way through Learn You a Haskell for Great Good. Despite its mystifying title, Dan found the guide very helpful for learning this purely functional programming language. He ended the day by writing a “hello, world!” program as well as a program to count words and lines of an input.

Like Dan, Neil Ashton decided to play around with a purely functional programming language, throwing himself into a refresher with Clojure. By the end of the day he felt comfortable working his way around Clojure’s libraries and built a basic web application using Clojure for the backend and ClojureScript for the frontend. He especially enjoyed the way ClojureScript plugin Figwheel enabled him to edit code with instantaneously visible results.

David Ray wrote a Python script for automating parts of the sales process in Google Drive. His work greatly improved internal processes for the sales team. As always, it was APIs to the rescue! He is now looking into integrating with other 3rd party services to automate even more of the process, with the end goal of creating a bot to generate the scaffolding from our HipChat room.

Inspired by a recent workshop hosted by the Triangle User Experience Professionals Association (TriUXPA) on designing better meetings, NC Nwoko teamed up with Tania Lee. The two used UX Design principles to generate ways to improve meetings here at Caktus. They focused on ways to prioritize ideas to determine meeting agendas, make meetings more visual, improve the meeting experience for those working remotely, ask better questions, and become better meeting facilitators.

Caktus meetings improvement notes

Hunter MacDermut developed a game using Phaser that combines fitness and gaming. Using the FitBit API, Hunter created a game requiring in-app purchases in the form of FitBit steps. In other words, the game uses up a bank of total steps the user can build up throughout the day. Once this bank is used up, the user has to take more steps in order to continue game play.

Vinod Kurup took a look at Lambda, Amazon’s serverless computing service, which requires only that the user write a function. Lambda then provides the infrastructure to scale that function up and down as needed. Vinod explored the benefits, functionality, and scalability of the service for potential use in later projects. He also found out that Mark Lavin had created a Pingdom-type project using Lambda.

Erin Mullaney, Mark Lavin, and Dmitriy Chukhin collaborated to build a front end application with map and chart data visualizations using Javascript and Python. The data for these visualizations was drawn from a RapidPro SMS survey taken live during their presentation. The project will ultimately be a part of the Caktus booth at this year’s PyCon!

dogs versus cats pie chart

In preparation for her sponsor workshop at PyCon this year with Erin, Leveraging Text Messaging in 2016 with RapidPro, Rebecca Muraya reviewed the history of RapidSMS and RapidPro as well as Caktus’ involvement in and relationship to these platforms.

Kia Lam returned to front-end basics to play around with CSS. She decided to recreate the Caktus logo using only HTML and CSS. She substituted the Caktus logotype with a Google Web font with similar characteristics. Using the power of CSS pseudo-elements, she was able to add the necessary styling and elements of the Caktus logo without adding any additional markup to her HTML. She then positioned each element absolutely within a relatively positioned container, therefore ensuring that the styling elements would always be positioned perfectly in relation to the logotype. For the finishing touch, she applied an “eraser” container that would make it appear as if the bottom half of the styling elements were cut off to create the desired effect. You can see her work on CodePen.

Calvin Spealman made improvements to Caktus’ Django Project Template around our frontend build tooling. He replaced it with a few lines of code that enable the sharing of build tasks between multiple projects and developed a library for the Cakti to use across future projects.

Fascinated by images he had seen altering photographs with convolutional neural networks, Jeff Bradberry familiarized himself with the available Neural Network libraries including Theano, Caffe, and Torch.

Charlotte Fouque evaluated a few cross-browser testing tools and settled on Browsera to find bugs, frontend mistakes, and Javascript errors and clean up a few websites. She also took some time to play around with Selenium to automate web functionality testing.

Tobias McNulty took the extra time to contribute back to the open source Django community. He modified and reviewed several tickets and gave a brief talk on why contributing to open source is so important.

Alex Lemann and Victor Rocha helped out Code for Durham by continuing to build additions to the Durham School Navigator. The two continued to build a survey tool to improve the site’s school profiles and worked on the layout of the profiles.

Colin Copeland improved the UX of OpenDataPolicingNC, a project he launched in cooperation with the Southern Coalition for Social Justice as well as a team of volunteer developers. He also built a sample homepage for possible expansion of the open data policing tool to other states.

OpenDataPolicingNC

Caktus GroupFrom Intern to Professional Developer: Advice on a Mid-Career Pivot

A few weeks ago, Rebecca Conley attended DjangoCon Europe 2016 in Budapest, Hungary. The event is a five-day conference that brings together Django lovers from all over the world to learn about and share each other’s experiences with Django.

Rebecca gave an inspiring talk on her transition into web development from other fields, including the non-profit sector. She approached the topic from a unique diversity in tech perspective, arguing that developers making such transitions have a great deal to offer the developer community as a whole. You can watch her talk below or check out the many other great talks here.

Tim HopperEcontalk

Listening to Russ Roberts' Econtalk podcast for the last 5 years has given me a whole new perspective on the world. Roberts has exposed me to a whole new way of economic thinking, refined my scientific skepticism, and introduced me to copious topics and scholars.

Here are episodes from over the years that I've particularly enjoyed.

One of my favorite guests is Duke economist Mike Munger. Here are some great interviews with him:

Incidentally, Priceonomics recently published a great article on Roberts and Econtalk.

Caktus GroupAdopting Scrum in a Client-services, Multi-project Organization

Caktus began the process of adopting Scrum mid-November 2015 with two days of onsite Scrum training and fully transitioned to a Scrum environment in January 2016. From our original epiphany of “Yes! We want Scrum!” to the beginning of our first sprint, it took us six weeks to design and execute a process and transition plan. This is how we did it:

Step 1: Form a committee

Caktus is a fairly flat organization and we prefer to involve as many people as possible in decisions that affect the whole team. We formed a committee that included our founders, senior developers, and project managers to think through this change. In order for us to proceed with any of the following steps, all committee members had to be in agreement. When we encountered disagreement, we continued communicating in order to identify and resolve points of contention.

Step 2: Identify an approach

Originally we planned to adopt Scrum on a per-project basis. After all, most of the literature on Scrum is geared towards projects. Once we started planning this approach, however, we realized the overhead and duplication of effort required to adopt Scrum on even four concurrent projects (e.g. requiring team members to attend four discrete sets of sprint activities) was not feasible or realistic. Since Caktus works on more than four projects at a time, we needed another approach.

It was then that our CEO Tobias McNulty flipped the original concept, asking “What if instead of focusing our Scrum process around projects, we focused around teams?” After some initial head-scratching, some frantic searches in our Scrum books, and questions to our Scrum trainers, our committee agreed that the Scrum team approach was worth looking into.

Step 3: Identify cross-functional teams with feasible project assignments

Our approach to Scrum generated a lot of questions, including:

  • How many teams can we have?
  • Who is on which team?
  • What projects would be assigned to which teams?

We broke out into several small groups and brainstormed team ideas, then met back together and presented our options to each other. There was a lot of discussion and moving around of sticky notes. We ended up leaving all the options on one of our whiteboards for several days. During this time, you’d frequently find Caktus team members gazing at the whiteboard or pensively moving sticky notes into new configurations. Eventually, we settled on a team/project configuration that required the least amount of transitions for all stakeholders (developers, clients, project managers), retained the most institutional knowledge, and demonstrated cross-functional skillsets.

Step 4: Role-to-title breakdown

Scrum specifies three roles: Development team member, Scrum Master, and Product Owner. Most organizations, including Caktus, specify job titles instead: Backend developer, UI developer, Project Manager, etc. Once we had our teams, we had to map our team members to Scrum roles.

At first, this seemed fairly straightforward. Clearly Development team member = any developers, Scrum Master = Project Manager, and Product Owner = Product Manager. Yet the more we delved into Scrum, the more it became obvious that roles ≠ titles. We stopped focusing on titles and instead focused on responsibilities, skill sets, and attributes. Once we did so, it became obvious that our Project Managers were better suited to be Product Owners.

This realization allowed us to make smarter long-term decisions when assigning members to our teams.

Step 5: Create a transition plan

The change from a client-services, multi-project organization to a client-services, multi-project organization divided into Scrum teams was not insignificant. In order to transition to our Scrum teams, we needed to orient developers to new projects, switch out some client contacts, and physically rearrange our office so that we were seated roughly with our teams. We created a plan to make the necessary changes over time so that we were prepared to start our first sprints in January 2016.

We identified which developers would need to be onboarded onto which projects, and the key points of knowledge transfer that needed to happen in order for teams to successfully support projects. We started these transitions when it made sense to do so per project per team, e.g., after the call with the client in which the client was introduced to the new developer(s), and before the holder of the institutional knowledge went on holiday vacation.

Step 6: Obtain buy-in from the team

We wanted the whole of Caktus to be on board with the change prior to January. Once we had a plan, we hosted a Q&A lunch with the team in which we introduced the new Scrum teams, sprint activity schedules, and project assignments. We answered the questions we could and wrote down the ones we couldn’t for further consideration.

After this initial launch, we had several other team announcements as the process became more defined, as well as kick-off meetings with each team in which everyone had an opportunity to choose team names, provide feedback on schedules, and share any concerns with their new Scrum team. Team name direction was “A type of cactus”, and we landed on Team Robust Hedgehog, Team Discocactus, and Team Scarlet Crown. Concerns were addressed by the teams first, and if necessary, escalated to the Product Owners for further discussion and resolution.

On January 4, 2016, Caktus started its first Scrum sprints. After three months, our teams are reliably and successfully completing sprints, and working together to support our varied clients.

What we’ve learned by adopting Scrum is that Scrum is not a silver bullet. What Scrum doesn’t cover is a much larger list than what it does. The Caktus team has earnestly identified, confronted, and worked together to resolve issues and questions exposed by our adoption of Scrum, including (but not limited to):

  • How best to communicate our Scrum process to our clients, so they can understand how it affects their projects?
  • How does the Product Strategist title fit into Scrum?
  • How can we transition from scheduling projects in hours to relative sizing by sprint in story points, while still estimating incoming projects in hours?
  • How do sales efforts get assigned to teams, scheduled into sprints, and still get completed in a satisfactory manner?
  • What parts of Scrum are useful for other, non-development efforts at Caktus (retrospectives, daily check-ins, backlogs, etc)?
  • Is it possible for someone to perform the Scrum Master role on one team and the Product Owner role on a different team?

Scrum provides the framework that highlights these issues but intentionally does not offer solutions to all the problems. (In fact, in the Certified ScrumMaster exam, “This is outside the scope of Scrum” is the correct answer to some of the more difficult questions.) Adopting Scrum provides teams with the opportunity to solve these problems together and design a customized process that works for them.

Scrum isn’t for every organization or every situation, but it’s working for Caktus. We look forward to seeing how it continues to evolve to help us grow sharper web apps.

Caktus GroupWhat We're Clicking - March Link Roundup

We’re starting a new, monthly series on the Caktus blog highlighting the articles and posts shared by Cakti that drew the most attention on Twitter. These roundups will include everything from Django how-tos to explorations of the tech industry, to innovations for social good.

This month we’re featuring articles on front-end development, data visualization, open source, and diversity in tech.

python for geospatial data processing

Python for Geospatial Data Processing

An excellent how-to from Carlos de la Torre on satellite image classification and geospatial data processing in Python.

Why Do Many Data Scientists Love Using Python Over Ruby?

We obviously love Python. But if you’re not convinced, read Harri Srivastav’s post singing the praises of Python for data management, processing, and visualization. Python’s speed, availability of libraries, options for graphics, and its large, active community make it a standout for working with data.

Rachel Andrew's Talks on Modern CSS Layout

Our lead front-end developer Calvin Spealman recommends all of Rachel Andrew's talks on modern CSS layout. She addresses everything from Flexbox to Grid and Box Alignment, as well as a handful of other front-end tools.

Should Your NGO Go Open Source?

Catherine Cheney’s article on the pros and cons of open source for NGOs sparked much discussion, especially on social media and at the most recent Digital Principles conference in DC.

a building in downtown Durham

Is a Different Kind of Silicon Valley Possible?

As a Durham-based tech firm invested in increasing diversity in tech, we were thrilled to see The Atlantic shout out Durham’s thriving tech scene and its attempts to address diversity from the ground up.

Caktus GroupCaktus expands anti-discrimination policies to include sexual orientation, gender identity

For the past few months, we’ve been working on broadening our anti-harassment and equal employment policy. We’re very pleased to announce that our policies now include sexual orientation and gender identity as of March 24, 2016.

This portion of our anti-harassment policy now reads:

Caktus will not tolerate any harassment of employees based on a person’s race, color, national origin, ancestry, religion, age, disability, gender, sexual orientation, gender identity, body, genetic information, marital status, political belief or activity, status as a veteran, or any other classification protected by law.

Our equal employment opportunity policy is:

It is a violation of Caktus policy to discriminate in the provision of employment opportunities, benefits or privileges; to create discriminatory work conditions; or to use discriminatory evaluative standards in employment if the basis of that discriminatory treatment is, in whole or in part, the person’s race, color, national origin, ancestry, religion, age, disability, gender, sexual orientation, gender identity, body, genetic information, marital status, political belief or activity, or status as a veteran. This policy applies to all areas of employment at Caktus. Discrimination in violation of this policy will be subject to disciplinary measures up to and including termination.

Fairness has always been an intrinsic part of our values, but greater inclusivity goes beyond fairness. We want each person at Caktus to feel welcomed. Our team should feel free to be themselves without fear. Openness is how innovation happens and, locally, helped drive Durham’s growth.

We’re deeply saddened at our state’s decision to pass HB2, but the resulting wave of public protests and the coming court challenges show that our past is not our future. We’re proud to be in excellent company with the many who believe that having an inclusive anti-harassment policy is simply the right thing to do.

Caktus GroupNew white paper: "Shipping faster: Django team improvements"

For the past couple of months, we’ve been working on a new white paper, “Shipping Faster: Django Team Improvements”. We examined our existing processes, looked at best practices, and considered what has or hasn’t worked across our dozens of simultaneous projects.

Development teams need to deliver projects quickly and on budget, but often run into challenges that are beyond technical prowess. To build apps faster and more sustainably, we take a holistic look at both technical and environmental influences. Here are the four factors we found that lead to sharper app development:

  • Partnering with stakeholders
  • Focusing on business impact
  • Building apps that can grow with needs (clean code!)
  • Keeping your team sharp

We go into detail on each of these within the white paper, highlighting common challenges and tips to overcoming them. Download the white paper by clicking the button below. We look forward to hearing what you think!

Caktus GroupLightweight Django now in Portuguese!

We're proud to report that Lightweight Django (O'Reilly Media) is now available in Portuguese as Django Essencial. The book was written by our technical director Mark Lavin and Caktus alumnus Julia Elman to great reviews. Django Essencial comes just in time for Mark's keynote talk during PyCon Nordeste.

Purchase Django Essencial here.

Mark Lavin holding Django Essencial

Django Essencial page

Caktus GroupChecking That It's All Translatable

When building a translated application, it's important to test that all of the text is going to be translated, but that's difficult to tell until the translation has actually been done. Until then, even when you switch languages you still see English everywhere. Only once all the text that's been set up for translation has actually been translated can you see the site in the other language, at which point any untranslated English messages stick out like a sore thumb. But that's usually very late in the process. How can we catch those errors earlier?

One trick that works surprisingly well is to do a "fake" translation. If you can programmatically modify your English text in a recognizable way and pretend that's your actual translation, when you run the site you can see the messages that have not been modified, and know they need to be marked for translation.

(I didn't come up with this idea, but saw it used in a previous job to great effect.)

Here's how I'm doing this for Django.

  1. Process the messages in the application to a .po file:

    python manage.py makemessages -l en
    
  2. Run the fake translation tool, taking the English .po file as input and producing a .po file for another language as output:

    python fake_translate.py locale/en/LC_MESSAGES/django.po locale/ar/LC_MESSAGES/django.po
    
  3. Compile the new .po file:

    python manage.py compilemessages
    
  4. Run the site, switch to the "translated" language and look for untranslated strings:

    python manage.py runserver
    

We've skipped over the fake translation tool, so now let's see how we can build that. The strategy will be to read the English .po file, go through the messages and "translate" each one, then write a new .po file:

#!/usr/bin/env python
# -*- python -*-

# Usage: python fake_translate.py inputfile.po outputfile.po

import sys

import polib

def translate(s):
    raise NotImplementedError  # the real implementation is developed below

po = polib.pofile(sys.argv[1])
for entry in po:
    if entry.msgid_plural:
        entry.msgstr_plural = {
            0: translate(entry.msgid),
            1: translate(entry.msgid_plural),
        }
    elif entry.msgid:
        entry.msgstr = translate(entry.msgid)
po.save(sys.argv[2])

There's one Python package it depends on, polib, which handles reading and writing the .po files for us.

I won't go into the reason for the special handling of the plural case. If you're curious, you can read all about how .po files work here.

It just remains to decide how we will "translate" the messages. We want the messages to still be readable, but to mark them somehow as "translated". So this will almost do it:

def translate(s):
    return "**%s**" % s

This just puts "**" at the beginning and end of each message, which makes it easy to spot any text in the site that hasn't been "translated".

The only problem occurs if any messages have leading whitespace. compilemessages is smart enough to make sure that the translated messages still start with the same whitespace (to catch translation errors, presumably), and if we put "**" in front of the whitespace, it raises a fatal error. To preserve any leading whitespace, we end up with:

def translate(s):
    if s[0].isspace():
        return s[0] + translate(s[1:])
    return "**%s**" % s
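
As a quick sanity check, here's what that final version produces for an ordinary message and for one with leading whitespace (the strings are just illustrative):

>>> translate("Hello, world")
'**Hello, world**'
>>> translate("  two leading spaces")
'  **two leading spaces**'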

Caktus GroupCaktus Internship Fuels a Career Re-Launch

What is it like to be an intern at Caktus? I am finishing up as the spring Django intern (though I didn't actually use any Django), so I'll share my experience.

What brought me to this point

As a long-time stay-at-home mom re-entering the workforce, I was apprehensive about how my lack of recent professional work experience and references would affect my job search. To update my skillset, I completed the intensive 12-week Python course at The Iron Yard. I was hoping to find work in the tech industry I had left behind before having three children.

I was fortunate to be offered a 12-week part-time internship at Caktus. On my application and in my interview, I had expressed an interest in exploring HTML and CSS in more depth than I had had a chance to do at The Iron Yard. And that is how I have spent the majority of my time here. My first task was to implement a style guide for the Caktus website. A generic style guide existed but needed to be customized to reflect the Caktus website.

What is a style guide?

Style guides do several things:

  • Give direction to designers and developers about the expected look and feel of a website
  • Serve as a basis from which to discuss additions and changes so that everyone has the same reference point
  • Provide an inventory of what exists for anyone to access (no special permissions are needed since it is not an active part of the website, so it can even be shown to a client)
  • Provide drop-in chunks of code / promote code reuse
  • Help identify and prevent inconsistencies across the website
  • In some cases, enable more accurate estimates of time and resources for new projects

What I did as an intern

I spent some time studying up on HTML, LESS, and CSS. Then Calvin, Caktus’ lead front-end developer, helped me get started by introducing me to the style guide in its generic form, and walking me through what needed to be done by styling a button from the Caktus website.

I began exploring the website to find what needed to be styled. I dove into the code to see what made things tick, and where the differences were between elements that had a similar style. The goal was to replicate them in the style guide so that anyone adding a button or card to the website would have the code at their fingertips, creating uniformity across the website and making their job faster and easier.

The style guide ended up including buttons; blog, press, and case study cards; and the fields from the Contact Caktus form. Some things were pretty quick and easy once I learned my way around. For example, I had known about the capability to inspect an element but didn’t know how to use the information found there. I learned a lot about classes and how HTML and CSS/LESS work together. This was more like a logic puzzle than work!

When I finished the style guide, my team was at the end of a sprint, and I asked for something to do in the short time before the next sprint started. It was a natural progression for me to use JIRA, Caktus’ issue and project tracking system, to log the bugs and inconsistencies I had come across during my travels through the Caktus website for the style guide project. Since I have volunteered over the years as an editor for various newsletters, inconsistencies pop out at me, and it was gratifying to log them.

My next assignment took me from working on an internal document to doing things that would show up in the real world.

I fixed several different bugs and issues with the website, but one of the most challenging was to fix links that were not working appropriately and/or consistently. Some didn’t underline when hovered over. Some underlined twice. A few returned a 404. None of the external links opened in a new tab or window. Links within the text of a blog, press release or case study had not been treated consistently.

Again, it was more like a game or a logic puzzle to fix these.

What I took away

As someone who had been out of the professional workforce for years, simply being in a more modern tech environment was educational - tools such as the issue tracking system, Agile methods, and even the chance to use Github in a real work environment were new for me. My co-workers, especially Karen, Erin, Calvin and Dmitriy never assumed that I didn’t know something, but also never assumed that I did. They checked up on me to make sure my internship was what I wanted it to be. The environment truly supported someone just starting a coding career.

I received a full-time offer from Blackboard and have to leave the internship after only nine weeks, which is both good and bad news. This internship opportunity is rare and valuable; Caktus simply wants to help junior-level programmers gain skills in a supportive environment which will make them better prepared to make real contributions in the workforce. I feel like I was really getting my feet under me, and the last 3 weeks would have been explosive in what I would learn. Still, I can’t say enough about how glad I am to have been chosen for this internship.

Caktus GroupBest Python Libraries

Our love for Python is no secret. But with so many modules, tools, and libraries, it can be overwhelming for beginning developers to identify the most useful. Obviously, our favorite framework is Django. But we’re setting aside our undying love for that framework for a moment to offer a list of other helpful Python libraries. From those offering standard solutions to everyday programming problems, to ones that hold a special place in the heart of the dev who created them, these are some of our developers’ favorite tools for Python development.

Calvin Spealman

coverage.py measures code coverage for Python programs. Tests are important, but too often overlooked. Coverage gives you a way to create a benchmark for how much of a project’s code you’re testing and for improving from there.

straight.plugin is the most widely used open source project Calvin has written, so he’s proud to see it pop up in the wild now and then. It provides a type of plugin a developer can create from almost any existing Python module as well as an easy way for outside developers to add functionality and customization to projects with their own plugins.

Victor Rocha

Requests is an HTTP library for Python. It makes it very simple to interact with APIs and crawl pages.

BeautifulSoup pulls data out of HTML and XML files. It enables a parser to provide Pythonic idioms for iterating, searching, and modifying a parse tree. Victor loves it for how it allows him to manipulate the Document Object Model tree in amazing ways when needed.
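
As a taste of how these two pair up, here's a minimal sketch that fetches a page and prints every link on it (the URL is just an example):

import requests
from bs4 import BeautifulSoup

# Download the page, then hand the HTML to BeautifulSoup for parsing
response = requests.get('https://www.caktusgroup.com/blog/')
soup = BeautifulSoup(response.text, 'html.parser')
for link in soup.find_all('a'):
    print(link.get('href'))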

Mock is a library for testing and mocking Python code. It was incorporated into the standard library in Python 3.3 as unittest.mock, so if you’re using Python 3.3 or newer, you can use it right off the bat. It seems all the Cakti value rigorous testing.
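
For a flavor of what Mock makes easy, here's a small hypothetical example that tests a function without ever touching the network:

from unittest import mock

def fetch_status(client, url):
    return client.get(url).status_code

# Mock() fabricates attributes on demand, so it can stand in for a real HTTP client
fake_client = mock.Mock()
fake_client.get.return_value.status_code = 200

assert fetch_status(fake_client, 'http://example.com') == 200
fake_client.get.assert_called_once_with('http://example.com')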

Mark Lavin

flake8 is a wrapper around pyflakes and pep8. It carries an additional feature for detecting overly complex code. As writers of clean, simple code, the Cakti love it!

Jeff Bradberry

NumPy is the main scientific library in Python, offering comparable functionality to MATLAB. A number of other science- or data-related Python libraries make use of it, such as SciPy, pandas, and scikit-learn. NumPy provides multi-dimensional arrays and fast operators and routines for manipulating these arrays.

pandas is a data analysis and manipulation library for Python. It offers flexible slicing and filtering as well as merging and reshaping of data.

matplotlib is a plotting library that generates beautiful plots and visualizations like histograms and various other charts and graphs. This, combined with the two libraries above in IPython Notebook (aka Jupyter), provides the user with a powerful and easy-to-use set of data analysis tools.
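
To illustrate how the three fit together, here's a short, hypothetical sketch that generates data with NumPy, filters it with pandas, and plots it with matplotlib:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# 1,000 samples from a standard normal distribution, wrapped in a DataFrame
data = pd.DataFrame({'x': np.random.randn(1000)})

# Keep only the positive samples, then draw a histogram of them
data[data['x'] > 0]['x'].hist(bins=30)
plt.savefig('positive_samples.png')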

Rebecca Conley

ipdb exports functions to access the IPython debugger. Rebecca prefers ipdb over pdb because it gives you tab completion and object introspection.
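
Dropping into ipdb is a one-liner wherever you need it (the surrounding function is hypothetical):

def total_price(items):
    total = 0
    for item in items:
        import ipdb; ipdb.set_trace()  # pause here with tab completion and introspection
        total += item['price']
    return total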

Rebecca also loved working with Markovify to build the TayTay Lyric Generator for one of our Caktus ShipIt Days. A simple, extensible Markov chain generator, Markovify completes Markov chain mathematics in a transparent way, enabling the user to learn how the math works by simply looking at the source code for the library.
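
Basic Markovify usage is only a few lines; this sketch assumes a plain-text corpus file named lyrics.txt:

import markovify

# Build a Markov chain model from the corpus
with open('lyrics.txt') as f:
    model = markovify.Text(f.read())

# Each call stitches together a new sentence in the style of the corpus
print(model.make_sentence())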

Caktus GroupTime for Flexbox First

The web development community has a habit of declaring as "firsts" those practices and approaches that reach some ill-defined status signaling they are the go-to way to solve a particular problem. We've seen "mobile first" and, more recently, "offline first." In these examples, a new problem comes along, and as that problem grows more common there comes a tipping point. On the other side of that tipping point, it begins to make sense to solve the problem from the ground up, rather than building a project and solving it as an afterthought.

As a concrete example, for "mobile first" this meant realizing it's a lot more difficult to retrofit mobile layouts into existing desktop sites than it is to build mobile websites that expand to fit larger, more capable devices.

With "offline first," we focus on building offline-availability into our web applications by constructing them to function totally in the browser and primarily offline. This way, online features become secondary and optional, similar to building a traditional native application.

The pattern is that when it’s hard to add something to a project, like mobile layouts or offline features, it is often easier to approach the problem from the opposite direction. Applying the relatively new CSS Flexbox rules has had the same problems. It is difficult to add to existing websites with large established layouts. This is especially true if you need to support browsers with incomplete or non-existent flexbox support.

Flexbox is Ready

The flexbox standard went through a rough early phase. Its original incarnation began to roll out in browsers several years ago, only to be found so fraught with problems that the spec was entirely replaced by what is now the CSS Flexbox Module. For browsers that had already implemented some or all of the failed version, the transition meant a difficult couple of years waiting for vendors to catch up with the turnabout.

Today, Flexbox is well supported across every important browser, including mobile versions of Safari, Chrome, Firefox, and Opera. Even Internet Explorer includes flexbox support as far back as version 10 (with some well-documented and easily avoided pitfalls).

The shortcomings in support across some browsers are easily outweighed by the vast improvement in the way we approach web page layout. More than that, by deciding flexbox is the primary way you'll lay out your site, you gain a host of advantages.

Flexbox is Better

Flexbox: a better way to do layout

CSS has grown in fits and starts over more than a decade with often painful standardization processes. This has led to a hodge-podge of properties and behaviors from which web developers have had to piece together solutions that stand up on a wide range of devices and browsers.

Flexbox is one of the most concise and wholly formed standards to be added to CSS. Rather than attack a single problem, or provide a new, simple layout feature, Flexbox aims to provide a holistic set of rules you can use for a wide range of layouts. This also means Flexbox essentially deprecates the usefulness of a whole host of previous CSS properties.

With Flexbox, we have a set of rules for a reasonable layout we can apply to our whole site, to the unique layouts of special pages, and to the interactive layouts of widgets across our projects. We can reuse the techniques we learn more easily as well as better predict what is going to work and where the limitations and edge cases lie.

The few exceptions don't break that rule

Inevitably every abstraction is going to get leaky and Flexbox won't be an exception to that rule. But, when you have a coherent set of rules that gets you most of the way to your goal, those few exceptions are a lot easier to manage. Fallbacks for the few corner cases in browser support or an edge case layout you can't quite get to work within the flexbox system are much easier to apply peppered at the end of a very concise process that gets you almost everything in one go.

Flexbox has Caveats

Internet Explorer 10 and 11

Unsurprisingly, you'll get the roughest edges from Internet Explorer (and Edge). These caveats are relatively minor, though. There has also been some great work at documenting all the IE Flexbox issues and their workarounds. If you read up on this, you can establish your own guidelines for how to use Flexbox so that you won’t even hit any of these issues.

Old IE and other non-supporting browsers

If you're a lucky developer you only need to support the latest and greatest browsers with the best support for modern features. Most of us aren't that lucky all the time, so the necessity will come up to support browsers with older, partial, or non-existent Flexbox support. Addressing this is another article unto its own. If this concerns you about diving into Flexbox First, I hope it will calm those worries to know there are strong approaches to handling these browsers without totally disrupting the value you get out of Flexbox itself.

Begin with a focus on the layout and the browsers that support Flexbox best. That's a core rule of going Flexbox First. If you need to support older versions of Android browser, Mobile Safari, or Internet Explorer, take a look at them only after you've established a strong baseline in the modern browsers. Don't look at these as targets you need to replicate the design in, though. Supporting Flexbox layouts on older browsers that totally lack it is a cause for applying graceful degradation. If you focus on perfectly replicating the same layout on both compliant and noncompliant browsers, you are only duplicating an effort that entirely erases the advantages you'd otherwise have gained. Instead, focus on providing a good experience on par with those browsers. Treat this as another aspect of responsive design, in which we accept and react to the range of abilities between devices that our content gets pushed to.

Flexbox bug repository

All of these issues are relatively easy to work with when you're familiar with the problems up front, so I recommend reading over the currently documented list at the Flexbugs repository. Check the list every now and then to find if there are any new issues discovered, or old issues fixed in new updates to browsers.

Flexbox is Today

If I've convinced you and Flexbox isn't something you have much experience with yet, there are great guides available. You can find A Complete Guide to Flexbox from CSS Tricks available online. For a light introduction, everyone should check out the delightful Flexbox Froggy interactive tutorial.

Try the tutorial to get a feel for how Flexbox works. Absorb the reference guide to get an in-depth understanding of all the Flexbox mechanisms. Try building your next website using everything you learn.

Joe GregorioAn update on httplib2

It's March 2016 and that means I've been building and maintaining httplib2 for over 10 years. As with most of my best software, httplib2 was initially a rage-based project, fed by my disgust at the state of HTTP client libraries in 2006.

In the past 10 years it's gathered quite a following, with version 0.9.2 currently being downloaded from PyPI at these rates:

   37297 downloads in the last day
   210596 downloads in the last week
   830573 downloads in the last month
  

I'm done.

I've been programming exclusively in Go for the past four years, and except for the occasional security bug to fix in httplib2 I've avoided using Python like the plague.

So I could go on being a bad package maintainer, or I can hand the project over to people who are more invested in keeping it working. If you are interested in being one of those people, please ping me via email or Twitter and I will add you to https://github.com/httplib2. That's a new GitHub organization, but I have done nothing beyond creating it; I'll leave it up to the new maintainers to decide about forking, importing bugs, etc. from the original project.

Caktus GroupWagtail: 2 Steps for Adding Pages Outside of the CMS

My first Caktus project went live late in the summer of 2015. It's a community portal for users of an SMS-based product called RapidPro. The portal was built in the Wagtail CMS framework which has a lovely, intuitive admin interface and excellent documentation for developers and content editors. The code for our Wagtail-based project is all open sourced on GitHub.

For this community portal, we needed to allow users to create blog pages on our front-facing site without giving those same users any level of access to the actual CMS. We also didn't want outside users to have to learn a new CMS just to submit content.

We wanted a simple, one-stop form that guided users through entering their content and thanked them for submitting. After these outside users requested pages be published on the site, CMS content editors could then view, edit, and publish the pages through the Wagtail CMS.

Here's how we accomplished this in two steps. Karen Tracey and I both worked on this project and a lot of this code was guided by her Django wisdom.

Step 1: Use the RoutablePageMixin for our form page and thank you page

Now for a little background information on Wagtail. The Wagtail CMS framework allows you to create a model for each type of page on your site. For example, you might have one model for a blog page and another model for a blog index page that lists out your blog pages and allows you to search through blog pages. Each page model automatically connects to one template, based on a naming convention. For example, if your model is called BlogIndexPage, you would need to also have a template called blog_index_page.html, so that Wagtail knows how to find the related template. You don't have to write any views to use Wagtail out of the box.

However, in our case, we wanted users to submit a BlogPage entry which would be a child of a BlogIndexPage. Therefore, we wanted our BlogIndexPage model to route to itself, to a submission page, and to a "thank you" page.

RapidPro blog workflow

This is where Wagtail's RoutablePageMixin came into play. Here's the relevant code from our BlogIndexPage model that routes the user from the list page to the submission page, then to the thank you page.

In models.py:

from django.template.response import TemplateResponse

from wagtail.wagtailcore.models import Page
from wagtail.wagtailcore.fields import RichTextField
from wagtail.contrib.wagtailroutablepage.models import RoutablePageMixin, route


class BlogIndexPage(RoutablePageMixin, Page):
    intro = RichTextField(blank=True)
    submit_info = RichTextField(blank=True)
    thanks_info = RichTextField(blank=True)

    @route(r'^$')
    def base(self, request):
        return TemplateResponse(
            request,
            self.get_template(request),
            self.get_context(request),
        )

    @route(r'^submit-blog/$')
    def submit(self, request):
        from .views import submit_blog
        return submit_blog(request, self)

    @route(r'^submit-thank-you/$')
    def thanks(self, request):
        return TemplateResponse(
            request,
            'portal_pages/thank_you.html',
            {'thanks_info': self.thanks_info},
        )

The base() method points us to the blog index page itself. Once we added the RoutablePageMixin, we had to explicitly define this method to pass the request, template, and context to the related template. If we weren't using this mixin, Wagtail would just route to the correct template based on the naming convention I described earlier.

The submit() method routes to our blog submission view. We decided to use the URL string "submit-blog/" but we could have called it anything. We have a view method submit_blog() defined in our views.py file that does the work of actually adding the page to the CMS.

The thanks() method routes to the thank you page (thank_you.html) and passes in content editable via the CMS in the variable thanks_info as defined in the BlogIndexPage model.

Step 2: Creating the form and view method to save the user-generated information

Here's the slightly trickier part, because we didn't find any documentation on adding pages to Wagtail programmatically. We found some of this code by digging deeper through the Wagtail repo and found the test files especially helpful. Here are the relevant parts of our code.

In forms.py, we added a Django ModelForm.

from django import forms

from .models import BlogPage

class BlogForm(forms.ModelForm):
    class Meta:
        model = BlogPage
        fields = '__all__'  # newer Django versions require explicit fields or exclude

In views.py, we created a view method called submit_blog() that does a number of things.

  1. Imports the BlogForm form into the context of the page.
  2. Upon submission/post, saves the BlogForm with commit=False, so that it is not yet saved to the database.
  3. Creates a slug based on the title the user entered with slugify(). This would normally be auto-generated and editable in the Wagtail CMS.
  4. Adds the unsaved BlogPage as a child of the BlogIndexPage (we passed in the reference to the index page in our routable submit() view method).
  5. Saves the page with the unpublish() command, which both saves the uncommitted data to our CMS and marks it as a draft for review.
  6. Saves a revision of the page so that we can later notify the Wagtail admins that a new page is waiting for their review with save_revision(submitted_for_moderation=True).
  7. Finally, sends out email notifications to all the Wagtail admins with send_notification(blog.get_latest_revision().id, 'submitted', None). The None parameter in this function means do not exclude any Wagtail moderators.

from django.http import HttpResponseRedirect
from django.shortcuts import render
from django.utils.text import slugify

# Wagtail 1.x import path for moderation notifications
from wagtail.wagtailadmin.utils import send_notification

from .forms import BlogForm


def submit_blog(request, blog_index):
    form = BlogForm(data=request.POST or None, label_suffix='')

    if request.method == 'POST' and form.is_valid():
        blog_page = form.save(commit=False)
        blog_page.slug = slugify(blog_page.title)
        blog = blog_index.add_child(instance=blog_page)
        if blog:
            blog.unpublish()
            # Submit page for moderation. This requires first saving a revision.
            blog.save_revision(submitted_for_moderation=True)
            # Then send the notification to all Wagtail moderators.
            send_notification(blog.get_latest_revision().id, 'submitted', None)
        return HttpResponseRedirect(blog_index.url + blog_index.reverse_subpage('thanks'))
    context = {
        'form': form,
        'blog_index': blog_index,
    }
    return render(request, 'portal_pages/blog_page_add.html', context)

Final Thoughts and Some Screenshots

Blog submission page

Front-end website for user submission of blog content.

Wagtail is very straightforward to use; we plan to use it on future projects. If you want to get started with Wagtail, the documentation is very thorough and well written. I also highly recommend downloading the open sourced demo site and getting that rolling in order to see how it's all hooked together.

Joe GregorioPiccolo Now Sports LaTeX Support

In the first of two enhancements to piccolo, LaTeX support has been added. Just put the LaTeX in a latex-pic element and it will be converted to a PNG, with the alt text set to the LaTeX code.

<latex-pic>E = mc^2</latex-pic>

Transforms into:

E = mc^2

This is mostly driven by me trying to learn Geometric Algebra and wanting to talk about things like:

e_{23} = e_2 \wedge e_3

Next up is adding inline Julia.

Frank WierzbickiJython 2.7.1 beta3 released!

On behalf of the Jython development team, I'm pleased to announce that the third beta of Jython 2.7.1 is available! This is a bugfix release. Bug fixes include improvements in zlib and pip support.

Please see the NEWS file for detailed release notes. This release of Jython requires JDK 7 or above.

This release is being hosted at maven central. There are three main distributions. In order of popularity:
To see all of the files available including checksums, go to the maven query for Jython and navigate to the appropriate distribution and version.

Jeff TrawickYour Errata Submission for ...


My phone screen flashed earlier this a.m. with the reception of e-mails indicating that a couple of fixes for typographical errors which I submitted some time ago for Fluent Python have been accepted by the author. I was motivated to submit them because of the near-perfection of the book, beautiful in concept and in implementation; it was my opportunity to help maintain a wonderful work. Just as importantly, the publisher made it easy.

The contribution of a quotation mark and a two-letter English word to a 740-page book is hardly remarkable. It is instead what is now an ordinary task made easy by the Internet — the same Internet that contributed immensely to the creation of the book in the first place, from the development and popularity of the subject matter of the book to the tools which were used to create it to the ability of a publisher to interact with more authors to the electronic marketing and commerce which resulted in my purchase.

This is not unlike the creation of the software we all use daily. A large amount of it has the mark of a company but it relies to a tremendous extent on open source software, easily obtained, usually easy to contribute to, and truly ubiquitous even in so-called proprietary software with which we interact. It works because of countless contributions big and small from developers all over the world, using collaboration methods made easy by the Internet. Ease of collaboration enables contributions which have further eased collaboration (not to mention the rest of electronic life), and the software industry is built on the result.

It is time to close the door on what has been called open source strategy, as the use of open source software and the need for strategies for appropriate consumption of the software and interaction with the communities has invaded even the darkest corridors of proprietary software development and become business as usual. All software projects are a fusion of open source and custom-built components, whether or not everyone involved acknowledges it.

I look forward to a refresh of my electronic copy of Fluent Python with the latest corrections. But since submitting those fixes to the book text, I've collaborated with a handful of open source projects in the Django ecosystem for the first time and seen most of my small contributions there accepted; I am still watching some of those for feedback from the project maintainer or for inclusion in a new release. Those contributions were an important day-job activity, enabling features that our customer requested which didn't quite fit into the existing application.

Possible upcoming book contribution — convince the Two Scoops of Django authors to rework their claim that you can't pass environment variables to Django apps running with Apache httpd :) I'm sure they think that Apache+Django implies mod_wsgi, and I guess it is not fun to pass through OS-level environment variables in that configuration. My 2 cents on that matter: Deploying Python Applications with httpd (PDF)

Caktus GroupWriting Unit Tests for Django Migrations

Testing in a Django project ensures the latest version of a project is as bug-free as possible. But when deploying, you’re dealing with multiple versions of the project through the migrations.

The test runner is extremely helpful in its creation and cleanup of a test database for our test suite. In this temporary test database, all of the project's migrations are run before our tests. This means our tests are running the latest version of the schema and are unable to verify the behavior of those very migrations because the tests cannot set up data before the migrations run or assert conditions about them.

We can teach our tests to run against those migrations with just a bit of work. This is especially helpful for migrations that are going to include significant alterations to existing data.

The Django test runner begins each run by creating a new database and running all migrations in it. This ensures that every test is running against the current schema the project expects, but we'll need to work around this setup in order to test those migrations. To accomplish this, we'll need to have the test runner step back in the migration chain just for the tests against them.

Ultimately, we're going to try to write tests against migrations that look like this:

class TagsTestCase(TestMigrations):

    migrate_from = '0009_previous_migration'
    migrate_to = '0010_migration_being_tested'

    def setUpBeforeMigration(self, apps):
        BlogPost = apps.get_model('blog', 'Post')
        self.post_id = BlogPost.objects.create(
            title = "A test post with tags",
            body = "",
            tags = "tag1 tag2",
        ).id

    def test_tags_migrated(self):
        BlogPost = self.apps.get_model('blog', 'Post')
        post = BlogPost.objects.get(id=self.post_id)

        self.assertEqual(post.tags.count(), 2)
        self.assertEqual(post.tags.all()[0].name, "tag1")
        self.assertEqual(post.tags.all()[1].name, "tag2")

Before explaining how to make this work, we'll break down how this test is actually written.

We're inheriting from TestMigrations, a TestCase helper we'll write shortly to make testing migrations possible, and defining two attributes on the class that identify the migrations we want to test between. migrate_from is the last migration we expect to have run on the machines we deploy to, and migrate_to is the latest new migration we're testing before deploying.

class TagsTestCase(TestMigrations):

    migrate_from = '0009_previous_migration'
    migrate_to = '0010_migration_being_tested'

Because our test is about a migration, data modifying migrations in particular, we want to do some setup before the migration in question (0010_migration_being_tested) is run. An extra setup method is defined to do that kind of data setup after 0009_previous_migration has run but before 0010_migration_being_tested.

def setUpBeforeMigration(self, apps):
    BlogPost = apps.get_model('blog', 'Post')
    self.post_id = BlogPost.objects.create(
        title = "A test post with tags",
        body = "",
        tags = "tag1 tag2",
    ).id

Once our test runs this setup, we expect the final 0010_migration_being_tested migration to be run. At that time, one or more test_*() methods we define can do the sort of assertions tests would normally do. In this case, we're making sure data was converted to the new schema correctly.

def test_tags_migrated(self):
    BlogPost = self.apps.get_model('blog', 'Post')
    post = BlogPost.objects.get(id=self.post_id)

    self.assertEqual(post.tags.count(), 2)
    self.assertEqual(post.tags.all()[0].name, "tag1")
    self.assertEqual(post.tags.all()[1].name, "tag2")

Here we've fetched a copy of this Post model's after-migration version and confirmed the value we set up in setUpBeforeMigration() was converted to the new structure.

Now, let's look at that TestMigrations base class that makes this possible. First, the pieces from Django we'll need to import to build our migration-aware test cases.

from django.apps import apps
from django.test import TransactionTestCase
from django.db.migrations.executor import MigrationExecutor
from django.db import connection

We'll be extending the TransactionTestCase class. In order to control migration running, we'll use MigrationExecutor, which needs the database connection to operate on. Migrations are tied pretty intrinsically to Django applications, so we'll be using django.apps.apps and, in particular, get_containing_app_config() to identify the current app our tests are running in.

class TestMigrations(TransactionTestCase):

    @property
    def app(self):
        return apps.get_containing_app_config(type(self).__module__).name

    migrate_from = None
    migrate_to = None

We're starting with a few necessary properties.

  • app is a dynamic property that looks up and returns the name of the current app.
  • migrate_from is the migration we want to set up test data in, usually the latest migration currently deployed in the project.
  • migrate_to will be defined on our own test case subclass as the name of the migration we're testing.

def setUp(self):
    assert self.migrate_from and self.migrate_to, \
        "TestCase '{}' must define migrate_from and migrate_to properties".format(type(self).__name__)
    self.migrate_from = [(self.app, self.migrate_from)]
    self.migrate_to = [(self.app, self.migrate_to)]
    executor = MigrationExecutor(connection)
    old_apps = executor.loader.project_state(self.migrate_from).apps

After insisting the test case class had defined migrate_to and migrate_from migrations, we use the internal MigrationExecutor utility to get a state of the applications as of the older of the two migrations.

We'll use old_apps in our setUpBeforeMigration() to work with old versions of the models from this app. First, we'll run our migrations backwards to return to this original migration and then call the setUpBeforeMigration() method.

# Reverse to the original migration
executor.migrate(self.migrate_from)

self.setUpBeforeMigration(old_apps)

Now that we've set up the old state, we simply run the migrations forward again. If the migrations are correct, they should update any test data we created. Of course, we're validating that in our actual tests.

# Run the migration to test
executor.migrate(self.migrate_to)

And finally, we store a current version of the app configuration that our tests can access, and define a no-op setUpBeforeMigration():

    self.apps = executor.loader.project_state(self.migrate_to).apps

def setUpBeforeMigration(self, apps):
    pass

Here's a complete version:

from django.apps import apps
from django.test import TransactionTestCase
from django.db.migrations.executor import MigrationExecutor
from django.db import connection


class TestMigrations(TransactionTestCase):

    @property
    def app(self):
        return apps.get_containing_app_config(type(self).__module__).name

    migrate_from = None
    migrate_to = None

    def setUp(self):
        assert self.migrate_from and self.migrate_to, \
            "TestCase '{}' must define migrate_from and migrate_to properties".format(type(self).__name__)
        self.migrate_from = [(self.app, self.migrate_from)]
        self.migrate_to = [(self.app, self.migrate_to)]
        executor = MigrationExecutor(connection)
        old_apps = executor.loader.project_state(self.migrate_from).apps

        # Reverse to the original migration
        executor.migrate(self.migrate_from)

        self.setUpBeforeMigration(old_apps)

        # Run the migration to test
        executor.migrate(self.migrate_to)

        self.apps = executor.loader.project_state(self.migrate_to).apps

    def setUpBeforeMigration(self, apps):
        pass


class TagsTestCase(TestMigrations):

    migrate_from = '0009_previous_migration'
    migrate_to = '0010_migration_being_tested'

    def setUpBeforeMigration(self, apps):
        BlogPost = apps.get_model('blog', 'Post')
        self.post_id = BlogPost.objects.create(
            title = "A test post with tags",
            body = "",
            tags = "tag1 tag2",
        ).id

    def test_tags_migrated(self):
        BlogPost = self.apps.get_model('blog', 'Post')
        post = BlogPost.objects.get(id=self.post_id)

        self.assertEqual(post.tags.count(), 2)
        self.assertEqual(post.tags.all()[0].name, "tag1")
        self.assertEqual(post.tags.all()[1].name, "tag2")
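
These migration tests run like any other Django tests. Assuming the test case above lives in an app named blog (a hypothetical layout), something like this runs just that module:

    python manage.py test blog.tests.test_migrations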

Caktus GroupShipIt Day Recap: Q1 2016

Last Friday, the Cakti set aside regular client projects for our quarterly ShipIt Day, a chance for personal development and independent projects. People work individually or in groups to flex their creativity, tackle interesting problems, or expand their personal knowledge. This quarter’s ShipIt Day saw everything from cat animations to improvements on our Taylor Swift lyric generator app. Read about the various ShipIt Day projects for Q1 of 2016 below.


Neil Ashton worked with the open source DemocracyOS, a platform for collaborative decision making. The platform is used around the world to increase participation in democratic systems. Neil took a look at the basic web app stack for the platform and submitted several pull requests to fix bugs in that system. All of his pull requests have since been approved, making Neil an official contributor to DemocracyOS!

Calvin Spealman went into ShipIt Day with the intention of building a story generator. However, after running into a roadblock late Thursday evening, he transitioned instead to work on frontend tooling upgrades and related documentation in Caktus’ Django project template. In the meantime, Vinod upgraded one of Calvin’s projects to the latest version of Margarita. This enabled Calvin to demonstrate the new frontend tooling upgrades, while allowing Vinod to test his upgrade pathway documentation.

Like Calvin, Colin Copeland made frontend changes to the Django project template and integrated those changes into an existing project.

cat animation still

Inspired by the work of Rachel Nabors, Erin Mullaney made a CSS3 animation out of one of her own drawings. By cycling through sprites of different images, she made an animation of a cat blinking while a mouse walked by. You can see the full animation here.

taylor swift lyric generator

Mark Lavin, Rebecca Conley, and Dmitriy Chukhin built on their work from a previous ShipIt Day, The Taylor Swift Song Generator. The team cleaned up some of the code and fixed the test suite, using Codecov for code test coverage and OpBeat for exception and performance monitoring. Mark also created Twitter preview cards for tweeting particularly choice TayTay lyrics generated by the app while Rebecca and Dmitriy enabled user-creation of a song title that would then become the first line of the chorus, making the app more interactive. By the end of the day the team was working on word visualization and cleaning up their data set, which are the main goals for the next chance the team has to work on the app.

Dan Poirier, Scott Morningstar, and Jeff Bradberry continued their work from a previous ShipIt Day as well, with the goal of getting the project template to deploy on Ansible in order to move away from Salt. They improved the usage of variables; added Supervisor, nginx, and gunicorn; pushed source code to a deployment target without going through GitHub; and updated documentation. Though the team couldn’t get deployment to a virtual machine to work, they are incredibly close and hope to have a deploy by the next ShipIt Day!

Hunter MacDermut got into using ES2015 with Gulp, Browserify, and Babel, building a task list that would auto-organize by priority. The to-do list app is organized by project and task, and further taps on each item increase that item’s priority value. Though Hunter didn’t have time to finish the sorting feature, in its current state it is a functional to-do list app that relies on localStorage getting and setting. The repo for the app can be found here.

Though she didn’t technically participate in ShipIt Day, NC Nwoko did break from her usual routine and went to the Regional AIDS Interfaith Network (RAIN) in Charlotte, which helps youth get involved in their treatment. She helped train caseworkers in the use of our Epic Allies app.

[Image: Durham School Navigator]

Victor Rocha continued his work on Code for Durham’s School Navigator app, building a new school profile to be tested in the community. David Ray added features and improvements to the mobile app version of the School Navigator, including geolocation functionality, a clear-input button to make subsequent address searches more efficient, and technical debt cleanup, converting some code that existed in triplicate into a single Angular directive.

Finally, Tobias McNulty worked on refactoring and cleaning up the deployment code for a client project. The project was based on an old Fabric-based deployment that Caktus used to use. He cleaned up the Fabric file, making it client-agnostic, pulled the configuration out into a YAML file, and merged the changes back into FabulAWS, the parent project. The next step will be to break these down into smaller files for autoscaling purposes. Meanwhile, Karen Tracey reviewed Tobias’ work.

Tim HopperMentions of John Cook on Github

People mention John Cook's blog a lot in Github repos.

I scraped the Github search pages to try to figure out which of his pages are most mentioned. His post Accurately computing running variance gets many more mentions than any other post. It provides C++ code for Knuth's algorithm for computing the mean, sample variance, and standard deviation for a stream of data.

Here are the top 12 pages from his site most linked on Github:

  1. Accurately computing running variance (377 mentions)
  2. Stand-alone code for numerical computing (58 mentions)
  3. A Bayesian view of Amazon Resellers (52 mentions)
  4. johndcook.com/blog (47 mentions)
  5. Computing the distance between two locations on Earth from coordinates (44 mentions)
  6. Math.h in POSIX, ISO, and Visual Studio (38 mentions)
  7. Three algorithms for converting color to grayscale (26 mentions)
  8. johndcook.com (21 mentions)
  9. Computing skewness and kurtosis in one pass (20 mentions)
  10. What’s so hard about finding a hypotenuse? (19 mentions)
  11. Random number generation in C++ (19 mentions)
  12. R language for programmers (19 mentions)

Tim HopperQuotes from Former Professors

Tim HopperTweet Your Moon

I created a Twitter account that tweets a moon emoji each evening1 representing the current phase of the moon.

I've wanted to do this for a while, but Joel Grus' posts about AWS Lambda inspired me to make it happen.

The code for this is on Github.

I also created a Github project that provides a template for creating Lambda-powered Twitter bots in Python.


  1. Evening in the eastern United States. 

Caktus GroupThe Trials, Tribulations, and Triumphs of Choosing an M&E Platform

republished with permission from ICTWorks.org

At the recent MERL Tech conference, Tania Lee (Caktus Group), Tom Walker (Engine Room), Laura Walker McDonald (SIMLab), and Lynnae Day (Oxfam America) led a session called, “The Trials, Tribulations, and Triumphs of Choosing an M&E Platform.” They’ve written up their reflections and learning from that session focusing on project design, tool design/research, and getting things off the ground, whether that means finding external help or building a custom solution.

Choosing a tool: where to start?

Many organizations come to a procurement process with a favourite platform in mind, either from prior experience or because they’ve heard compelling marketing or stories. Starting from the tool means that you are not being true to what really matters – your needs and those of your users.

SIMLab and Caktus Group use an Agile methodology that prioritizes a general sense of direction and strong knowledge of the user, and uses that to get to a prototype that actual users can test as early as possible (whether you’re building a tool or configuring an existing one). Getting feedback on a system’s capabilities while you’re developing it lets you respond as your idea meets reality.

Understand your direction of travel, through workshops with your team, potential users and other stakeholders

  • Create and understand the theory of change for the tool thoroughly.
  • Write ‘user personas’ – a concise, clear picture of the people you are building for, including their habits, levels of technical skill and working environment.
  • Write or draw a process map that shows exactly what information is being exchanged, where it comes from, where it goes and how. This helps you to see potential blockers or impractical elements of your plan.
  • From the process map and user personas, develop user stories (individual tasks that the system should be able to perform). Prioritize those, and develop the first few.
  • As soon as you have something you can test, show a prototype system to real users.
  • When you can, use it ‘live’ in a pilot and continue to build and release new versions as you go.

Things to consider while going through this process:

  • Project timeframes: is this system constrained by a particular project timeframe, or does it need to be sustainable beyond the end of your project? Is it an internal operational system – in which case there may be no end date for its use?
  • What level of security do you need the system to have? What kind of data will it handle, and what are the legal requirements in the country (or countries) where it will be used?
  • What level of user support and documentation do you need?
  • Will users need training?
  • What level of system analytics are you hoping for?
  • How tolerant of risk are you/your organization? For example, do you need to choose an established vendor or product that you know will still be there five years from now?

Challenges

The session acknowledged some of the challenges with this kind of work:

  • In large organizations, working on internal projects requires consultation with a wide range of stakeholders.
  • Project managers are often trying to bridge gaps in knowledge (and even language) between technical and non-technical people. They may themselves not be technologists.
  • Donor funding may not allow the kind of flexibility inherent in an Agile development process. The proposal process requires you to have done much of the design work – or to presuppose its outcome – before you get the funds to support it.
  • During the development process, the environment and needs themselves may change.
  • No product or system is ever truly ‘finished’.
  • Systems development is rarely funded well enough to build something robust that meets all non-functional requirements and allows for maintenance and updates.

What’s already out there?

If you don’t already have a tool in mind, the next step is to conduct a market analysis. A market analysis helps you determine what tools are available and decide whether to use or customize an existing tool, or develop your own custom solution.

There are several common challenges organizations face in their search for M&E tools, including:

  • Finding a tool that’s an exact fit for all of your needs, particularly for larger projects; you’re likely to have to pick and choose.
  • Finding all of the relevant options quickly and easily, or…
  • …when you find a host of similar tools, quickly and easily understanding their specific differences and value-adds.
  • Developing user needs and tool requirements in an often-changing context (internal or external).

There are also several common approaches organizations take to navigating these challenges. Some of these include:

  • Staff who are familiar with existing tools, or who have experience with a specific tool that fits the requirements, can do some simple testing and make recommendations.
  • Talking to peer organizations to see what tools they’ve used (or not used, and why) for similar processes or projects.
  • Hiring consultants or working with IT departments to support the search, selection, and contracting.
  • Referencing some of the tool guides that exist (like NetHope or Kopernik), but without relying completely on them for decision-making, as they’re not always up to date and may not have all of the information you need.

Finally, there are some key questions to ask when assessing your final pool of solutions, besides whether they fit the bulk of your minimum user requirements, including:

  • How will the user interact with the system? What kind of software/hardware requirements and training will your users need to get the most out of their experience?
  • What kind of tech support is available – end user support? customization support? break-fixes? anything at all? And how much will varying levels of support cost?
  • What are the up-front costs (for software and hardware)? What are the on-going costs at the current project size? What are the on-going costs at scale (whatever the size of your “scale” is)?
  • If you’re planning to scale, how well does the tool adapt to different environments? For example, does it work offline and online? Does the interface support localization in the language that you need? Does it have the capacity to integrate with other tools?

You’ve done the groundwork: what next?

Once you’ve taken the time to understand your needs and look into what tools are already out there, you might find you still need features that an off-the-shelf tool can’t provide. If you don’t have the necessary skills within your organisation, you might need to consider getting technical support from outside your organisation, or building a custom solution.

Finding the right external help, if you need it

During the discussion, many people said that they found it difficult to know where to find good sources of external support. This seems to be a struggle in many contexts: it was also a common theme in the engine room’s research project (supported by Making All Voices Count) into how transparency and accountability initiatives in Kenya and South Africa choose tools.

Sometimes, this is just a question of networks and knowing who to ask. The discussion threw up a collection of useful places to start (such as the Pelican Initiative), and we’re currently collecting more. But there are some things that you can do to help pick the right support provider.

Choose criteria that make sense for your organisation. In the discussion, a lot of people said that the most important factor determining whether they could work well with a technical provider was how well the provider understood their organisation’s long-term goals. Providers of technology tools can sometimes focus on the tool itself rather than the organisation and its goals. For a relationship to be successful, there needs to be a clear understanding of how a tool fits into an organisation.

Be clear with your needs

This makes it critically important to present your needs to support providers in a clear, comprehensive way. Whether this takes the form of a formal request for proposals, a short brief or a specification document, communicating your needs to people outside your organisation is actually a good way of narrowing down which features are really essential – and which you can do without.

Writing this in language that technical providers can understand often takes a lot of time and effort. You may already have a lot of the raw material for a good brief if you’ve documented your needs and got a good idea of existing products, including elements like user personas and user stories (see the Understanding your needs blog, above).

At this point, several members of the discussion said that they had found it helpful to work with a consultant who could translate their needs into technical specifications – which they could then distribute to actual providers. It’s worth looking at existing templates (like this one by Aspiration) and guides on the request-for-proposal process (like this one by TechSoup).

Different organisations will have different criteria for making a selection. Some might value knowledge of a particular country context, experience in a particular sector or a more general sense of how the organisation is likely to grow and change over time. Others found it more helpful when providers were physically located in the same city as them, or available to answer queries at specific times. Several people in the discussions mentioned times when they hadn’t considered factors like these – and came to regret it later.

Building a custom solution

Building a custom solution, in this context, means building a tool (or integrating a set of tools) that meets needs no other tools in the marketplace can. Deciding to build a custom tool may require weighing tradeoffs: short-term needs vs. long-term flexibility, specialized usability features vs. time to market, or unique program needs vs. organizational standards.

One of the most often repeated challenges identified by session participants was securing the necessary funding to properly build a custom solution. It can be difficult to convince donors to pay for technical resourcing of a software development project and/or maintenance and support of the system.

Sometimes, organizations will find themselves cobbling together a few different grants to fund an initial prototype of a system. This approach can help to generate additional organizational support, as more programs have a stake in the custom system’s success. It’s also important to consider how and where a custom solution may fit into the organization’s overall strategy – if there is alignment, additional support across various teams could also help to ensure the project’s success.

Custom solutions require custom skills

Managing the custom solution project requires technical skillsets and a good understanding of the software development cycle. Without someone to line up necessary resources, make day-to-day decisions, and provide strategic guidance and technical oversight, a project can easily lose direction and the end result may not meet your needs. It may be helpful to collaborate with other teams within your organization, or to hire a vendor or consultant who can provide project management services.

Session participants also recommended building custom solutions using an iterative or “Agile” approach; this means that smaller sets of functions are tested at various points in the process to collect more useful and direct feedback. An Agile software development approach can help to identify the highest priority features so that a team can spend their time and energy working on what’s important. If this is your organization’s desired approach, it’s important to work with a team that has the right experience working in this way.

Some very helpful guidelines can be found in the Principles for Digital Development, which describes principles for building ICT tools and is a great way to approach building any custom solution.

Caktus GroupModified Preorder Tree Traversal in Django

Hierarchical data are everywhere, from product catalogs to blog post comments. A classic example is the tree of life, where kingdoms are subdivided into a hierarchy of phylum and class down to genus and species. What if you wish to store this data in a database table, which is inherently flat? Databases do not natively store hierarchies, so you need to work around that.

MPTT, or modified preorder tree traversal, is an efficient way to store hierarchical data in a flat structure. It is an alternative to the adjacency list model, which can be inefficient for querying entire subtrees. This post will cover the MPTT approach, examine its tradeoffs, then explore django-mptt, a package for adding MPTT to your Django models.

MPTT Structure

The MPTT approach adds a lft and a rght attribute to a model, which lets you easily determine parent-child relationships. See the example tree with lft and rght values below (GlobalCorp, for example, has a lft value of 1 and a rght value of 20). The dotted line in the image shows the path taken to calculate child relationships within the tree.

The model attributes allow you to do SQL queries such as the one below to get all USACorp subsidiaries:

SELECT * FROM example_tree WHERE lft BETWEEN 2 AND 10;
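
With django-mptt (described below), you rarely need to write that SQL by hand. A rough ORM equivalent, assuming a Company model instance for USACorp:

# USACorp's entire subtree, via the stored tree fields:
Company.objects.filter(tree_id=usacorp.tree_id,
                       lft__range=(usacorp.lft, usacorp.rght))
# ...or, using one of the convenience methods covered below:
usacorp.get_descendants(include_self=True)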

I've used MPTT in two of my major projects at Caktus. A couple of model patterns that might suggest using MPTT are:

  1. Many class names describing the same basic content type in an arbitrary hierarchy:
     https://caktus-website-production-2015.s3.amazonaws.com/media/images/All/fake_model_tree1.png
  2. Awkward containment structures where you are creating a bunch of linked models that are basically different words for the same model:
     https://caktus-website-production-2015.s3.amazonaws.com/media/images/All/fake_model_tree2.png

MPTT Tradeoffs

The MPTT approach is beneficial in that, with only a couple of fields, you can determine an entire tree structure. Because of this economy, retrieval operations are efficient: you can very quickly query a tree to determine its relationships. The tradeoff is that inserts and moves are slow. If the structure of the tree is constantly changing, MPTT is not a good option, because it needs to update many or all of the records. It also takes a whole-table lock during these updates, which prevents any other updates to the affected table. This is obviously less than ideal if you have a heavily updated database.

django-mptt

The django-mptt project is a convenient way to incorporate MPTT into Django. It provides a base model, MPTTModel, which implements the following tree fields for you:

  • level (indicating how deep in the tree a node is)
  • lft
  • rght
  • tree_id (to identify tree membership for any given instance)

It also provides the following convenience methods which abstract these tree fields and help you to manage the tree:

  • get_ancestors(ascending=False, include_self=False)
  • get_children()
  • get_descendants(include_self=False)
  • get_descendant_count()
  • get_family()
  • get_next_sibling()
  • get_previous_sibling()
  • get_root()
  • get_siblings(include_self=False)
  • insert_at(target, position='first-child', save=False)
  • is_child_node()
  • is_leaf_node()
  • is_root_node()
  • move_to(target, position='first-child')

Of these methods, get_root() and insert_at() are particularly helpful. Manually modifying lft and rght values is not a good idea, and insert_at() is a safe way to update the tree. I use get_root() all the time to double-check or even short-circuit child values. For example, if you have five product types, you could have five trees and specify all the product-related information at the root of each tree. Then any child node could simply ask for its root’s values:

product = Products.objects.get(id=123)
product.get_root().product_color

Example Class

from django.db import models
from mptt.models import MPTTModel, TreeForeignKey

class Company(MPTTModel):
    name = models.CharField(max_length=255)
    parent = TreeForeignKey('self',
                            related_name='client_parent',
                            blank=True,
                            null=True)
    is_global_ultimate = models.NullBooleanField()
    is_domestic_ultimate = models.NullBooleanField()
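
A minimal usage sketch, assuming the Company model above has been migrated (the instance names here are hypothetical):

# django-mptt maintains lft, rght, level, and tree_id on save
globalcorp = Company.objects.create(name="GlobalCorp")
usacorp = Company.objects.create(name="USACorp", parent=globalcorp,
                                 is_domestic_ultimate=True)
companya = Company.objects.create(name="Company A", parent=usacorp)

companya.get_root()                         # returns globalcorp
usacorp.get_descendants(include_self=True)  # usacorp and companya
companya.is_leaf_node()                     # True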

Using the tree

This example method finds the domestic headquarters for any given company in a tree.

def get_domestic_ultimate(self):
    """
    If current company is flagged as domestic ultimate, return self.
    Otherwise iterate through ancestors and look for the
    first domestic ultimate.
    """
    if self.is_domestic_ultimate:
        return self
    mytree = self.get_ancestors(ascending=True, include_self=False)
    for comp in mytree:
        if comp.is_domestic_ultimate:
            return comp
    return None
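
Given the test tree below, and assuming the method above lives on the Company model, a hypothetical call looks like this:

companyb = Company.objects.get(name="Company B")
companyb.get_domestic_ultimate()  # returns the USACorp instance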

Setting up a Test Tree

class CompanyTestCase(TestCase):
    def setUp(self):
        self.globalcorp = factories.CompanyFactory(name="GlobalCorp",
                                                   is_global_ultimate=True,)
        self.usacorp = factories.CompanyFactory(parent=self.globalcorp,
                                                   is_domestic_ultimate=True,
                                                   name="USACorp")
        self.companya = factories.CompanyFactory(parent=self.usacorp,
                                                   is_headquarters=True,
                                                   name="Company A")
        self.companyb = factories.CompanyFactory(parent=self.usacorp,
                                                   name="Company B")

Testing the Tree

def test_tree_parameters(self):
    self.assertEqual(self.globalcorp.lft, 1)
    self.assertEqual(self.globalcorp.rght, 20)
    self.assertEqual(self.globalcorp.level, 0)
    self.assertEqual(self.asiacorp.lft, 10)
    self.assertEqual(self.asiacorp.rght, 19)
    self.assertEqual(self.asiacorp.level, 1)
    self.assertEqual(self.globalcorp.get_descendant_count(), 9)
    self.assertEqual(self.usacorp.get_descendant_count(), 3)
    self.assertEqual(self.asiacorp.get_descendant_count(), 4)

Treenav

We find MPTT very useful here at Caktus and have even created a product that integrates with django-mptt called django-treenav. This product is an extensible, hierarchical, and pluggable navigation system for Django sites.

One Last Gotcha

There is an incompatibility between django-mptt and GIS in PostgreSQL. If you are using django-mptt and CREATE EXTENSION postgis;, you can't use MPTTMeta attributes in your MPTTModels:

class MPTTMeta:
    level_attr = 'mptt_level'
    order_insertion_by = ['name']

Adding these meta options won't fail in any obvious way; django-mptt will simply, and cheerfully, report an incorrect tree structure.

Tim HopperEverything a Stranger Said About My Height in 2015

Caktus GroupWhat We Open Sourced in 2015: A New Year's Retrospective

This year we had the pleasure of building a number of unique solutions for several organizations. In addition, we had the support of these clients to open source the tools we built. By open sourcing our work, we enable others to use, replicate, and even improve upon the tools we’ve created.

With the year coming to a close, we thought we would take a moment to reflect on the projects we’ve had the opportunity to open source this year.

Commodity Tracking System, International Rescue Committee (IRC)

On behalf of the IRC, Caktus developed significant upgrades to a previous iteration of the Commodity Tracking System (CTS). The FedEx-style system enables IRC employees to reliably track the shipment of humanitarian aid to Syrian refugees across Jordan, Turkey, and Syria. To help handle the increasing volume of shipments the IRC oversees, we created a cohesive web-based application that links information from an array of technologies to track and verify delivery of aid shipments. The enhanced application enables the IRC to quickly create and customize large shipments, generate custom barcodes for each shipment for scanning purposes, and pinpoint these shipments through a mobile data collection system. Finally, the system enables the IRC to easily report the delivery status of a donor’s aid donations, thereby providing the accountability and transparency donors desire.

Service Info, International Rescue Committee (IRC)

In addition, we worked with the IRC on another web-based app called Service Info, which helps Syrian refugees in Lebanon to identify, locate, and utilize the various aid services available to them. Aid providers can self-register on the site to promote their various services, and users of the site can search for these services on an easily navigable interface. More importantly, the platform provides a space for users to comment on and rate these services, the hope being that such feedback will, in turn, improve the quality of service. In addition, Service Info will continue to offer a localized network of vetted, rated, and reliable service organizations even after larger non-governmental organizations like the IRC may have to move their operations elsewhere.

Smart Elect, Libya’s High National Elections Commission (HNEC)

After building an SMS voter registration app—the first of its kind—that enabled 1.5 million Libyans to register to vote across two national elections, we had the privilege of further working with the Libyan High National Elections Commission (HNEC) and the United Nations Support Mission in Libya to open source the SmartElect platform. The tools in this elections management platform range from SMS voter registration, to bulk alerts to voters, to call center support software.

Ultimate Tic Tac Toe, Internal ShipIt Day Project

Caktus’ Ultimate Tic Tac Toe game started out as an internal ShipIt Day project built in AngularJS by developers Victor Rocha and Calvin Spealman with initial design help from Wray Bowling. Several iterations later it is a fully fledged game with design by Trevor Ray, AI components by Jeff Bradberry that have received recognition from Hacker News, and challenging gameplay that captivated visitors with an interactive touch screen at our booth at both PyCon and DjangoCon.

django-treenav and django-scribbler, Contributions from our Open Source Fellow

Inspired by the Django Software Foundation’s fellowship as well as the Two Day Manifesto, we launched our own Open Source Fellowship this year. As part of that program, Ben Phillips worked on significant improvements and upgrades to two Caktus-built projects: django-treenav, an extensible, hierarchical, and pluggable navigation system for Django sites, and django-scribbler, an application for editing and managing snippets of text and HTML on the front-end through custom template tags.

We are thrilled to have had the opportunity to work on such a wide range of projects this year, and even more so to have been able to open source the code for some of our work, especially those projects in the international development sector. We’re truly excited to see what 2016 will bring, and to continue our efforts to support and extend the influence and potential of open source tech solutions.

Og MacielEnd of Year - 2015

Review of 2015

Another year has gone by and I guess it is time to review the things I set out to do and grade myself on how well (or poorly) I fared. Here are some of my goals for 2015:

Read 70 Books

Grade: PASS

Even though I had a very, very busy year at work, with many releases of Red Hat Satellite 5 and Red Hat Satellite 6 shipped to our customers, I managed to surpass my goal of reading 70 books, finishing the year with a whopping 79 books read! You can see the books I read here: Year in Books

This year I also spent a good chunk of my time looking at old, used books, and my personal book collection increased considerably. At one point I had so many piles of books lying around the house that I had to buy 4 new book cases to store them. At first I wanted to have them custom made, but the estimates I got from 3-4 different people were way out of my budget. In the end I went with 4 Billy Bookcases from Ikea, which cost me about 10 times less!

If you want to see what I'm reading or want to recommend a book which you think I might enjoy reading, please feel free to add me on GoodReads.

Caktus GroupReflecting on My Time as Caktus' Open Source Fellow

My name is Ben Phillips and I am Caktus' Open Source Fellow. As my fellowship comes to a close, I wanted to reflect on my time at Caktus and to share my experience and some of what I've learned here. First, however, I should probably share how I ended up here in the first place.

Initial Commit

Six months ago I was a biologist. That had been the plan all along, after all. It had taken me quite a while. I'd had to work as a barista, stream surveyor, machinist/welder, and mouse husbandry technician and to get a bachelor's degree in biology, but I eventually achieved my goal and found a job as a research technician in a molecular/cell biology lab. Unfortunately, as is often the case with the plans of mice and men (both of which, as a former biologist and mouse technician, I am qualified to speak about), there was a small problem. It slowly became clear to me that this was not what I wanted to do with my life. The research was interesting, but the work was not. I was at something of a loss until one day when, in the middle of a particularly tedious task, I began to wonder if I could figure out how to write some kind of program to help make my task a bit easier. Though it turned out that integrating a computer vision library into a program to track the movements of mice spinning in circles was a bit too ambitious for a first project, I had found something interesting and challenging that I really enjoyed. A few months later, I quit my job to enroll in The Iron Yard, a local code school, and soon found myself with the technical skills of a junior Python developer. Now what? The answer, of course, was Caktus.

Getting Started

I found out about Caktus when I attended the local Python developer meetup they host every month in the Caktus Tech Space and began to learn about them as a company and as people. I was immediately attracted to both the exciting work they do and their focus on giving back to the local and global community. After graduating from code school, I quickly applied to their fall internship position, and after one of the friendliest interviews I've ever had and an anxious week or two of waiting, I found myself joining the Caktus team as their new Open Source Fellow. The aim of my internship was for me to make contributions to open source projects I found interesting or useful, thereby giving back to that community, which is so vital to Caktus and to developers in general. Where does one start, however, with such a huge breadth of possibilities? For me, the journey began with Django-Treenav.

Contributing

Treenav is a project created by some of the Caktus team that is designed as “an extensible, hierarchical, and pluggable navigation system for Django sites”. As a fairly well-established, older project, Treenav provided an ideal starting point for me. Most of the work required was updating and removing deprecations throughout the project. With some direction from my incredibly helpful coworkers, I got into a flow of finding and updating deprecated areas of the codebase which, as a relatively simple but very important task, allowed me to quickly add value to the project and gain confidence in my abilities as a developer. It also helped teach me the fundamentals of contributing to a project: opening and responding to issues, making discrete and focused pull requests, having code reviews, and maintaining good communication. A string of updated deprecations and even a couple of bug fixes later, I was able to help publish the 1.0 release of Treenav, which was a huge milestone for me. I had helped contribute to a tool that was available for people to use in projects all over the world. So cool.

Refactoring

My next project was another Caktus original: Django Scribbler. Scribbler is an application for editing and managing snippets of text and HTML on the front-end through a few custom template tags. I started out working on this project much the same as I did with Treenav, updating deprecations and the like. But with the guidance of my mentors and coworkers, I began to tackle more challenging work, such as fixing bugs and even starting to add some features, like support for multiple template engines.

Additionally, I took my first real dive into testing. I had gotten some experience with test driven development and writing tests in code school, but had never worked within an established test suite before nor had any experience with tools like Selenium, which was a necessity for such a JavaScript-heavy project. By far the largest task I undertook working on Scribbler, however, was revamping its module-loading/package management system. The project relies on a number of JavaScript libraries and when I started working on Scribbler these libraries were all managed with RequireJS. While I was working on updating some of the JS dependencies, we decided to migrate to using Browserify for that purpose instead. Throughout the process of this migration, I learned not only how to convert from RequireJS and an AMD style loading system to Browserify and its Node-style bundling, but I learned why that was a good decision for the project, observing the sorts of discussions and considerations that go into making such design choices for a project. RequireJS and Browserify accomplish much the same thing in very different ways and we had to decide which fit best with our project from not only a technical standpoint but also from that of a design philosophy. In the end, my work on the conversion had helped streamline the entire package management and build process for the project and, just as importantly, I understood why it was better.

Deployment

Everything I had worked on up until this point, though open source, had originated from Caktus, and an important purpose of my internship was to give back to the open source community at large—a purpose I strongly believe in. As such, while I was wrapping up work on the Browserify transition and a few smaller features for Scribbler, I began to look around for other projects I could contribute to. It was a daunting prospect and, to be perfectly honest, one that I was initially unsure I was ready for. However, as I began to canvass issues on projects, communicate with other developers, and get positive feedback from my coworkers, I realized I had a lot more to offer than I had previously thought. There were issues I had experience with, discussions I could weigh in on, and bugs I could dig in on, figure out, and fix. The codebases were unfamiliar and often large and imposing, but I found I could understand a lot as I started to look over them. Before long, I had pull requests out on a number of projects, including some on Django itself. I often had to heavily refactor or completely rewrite my contributions based on comments I received, but instead of being discouraged I found myself actually enjoying learning how to adjust my code based on feedback and improving what I had written until it was something I could be proud of. There is a great sense of satisfaction that comes from that feeling of partnership one gets working with other developers on an open source project, just as much as there is in actually having content you wrote merged into a project and knowing you've helped make a project better for everyone who uses it. As of the writing of this article, I have had the privilege of having contributions merged into several projects, including Django Rest Framework and Django, an accomplishment that would not have been possible for me even a few short months ago.

Release Notes

My time here at Caktus has been brief, but so incredibly rewarding. With the help of my mentors and coworkers, I have made huge leaps in both my technical skills and my confidence in my ability to add value to projects. Beyond even those technical skills, by working with and watching the people here at Caktus I have begun to learn some of the even more important, if less tangible, developer skills. I have learned not to just fix problems I come across, but to dig in and understand them first, and to ask questions quickly when I'm stumped. I have also seen the importance of approaching development not just as writing code but as crafting projects, building a whole rather than a series of parts. These are the sorts of skills that cannot be easily taught, but the people working here at Caktus have been exemplary teachers and examples. You would be hard-pressed to find a more knowledgeable, helpful, and friendly group of people. At no point in my internship did I ever feel like an intern. Everyone always treated me with respect as a true developer and member of the team, and I will always appreciate that. As my internship comes to a close and I look towards my next steps, I know that I will always look back on my time at Caktus with fondness and gratitude. I have learned so much and am a stronger developer than when I started. And I have gained a love for open source that I doubt I will ever lose. I have developed new skills, new tools, new confidence, and new friends. So if you are considering an internship at Caktus, I cannot recommend it enough. You'll never regret it.

Caktus GroupCaktus Participates in Tree of Bikes at American Tobacco Campus

This year, our neighbors at American Tobacco Campus (ATC) hosted Durham’s first ever Tree of Bikes event to help collect and distribute new bikes to children in need in the local community. The tree is a 25-foot tall sculpture made entirely of new children’s bicycles donated by individuals and local businesses in the Triangle. The bikes are then distributed to children living in the Cornwallis Road affordable housing community managed by the Durham Housing Authority.

Having heard about the drive in a tweet from ATC, Caktus employees were incredibly excited to participate. We all chipped in to buy a beautiful and sturdy little bike from our next door neighbors at Bullseye Bicycle. The great team at Bullseye made sure the bike was ship-shape for immediate riding! In addition, they were kind enough to provide a voucher for free tuneups and care on the bike for one year after purchase.

In conjunction with Habitat for Humanity and Happy Roots Entertainment, the bikes were collected at ATC throughout the month of November. The organizers reached their goal of collecting 120 bikes for the tree, which was lit in a ceremony held on December 4th. Though Happy Roots has been distributing bikes to children in affordable housing communities for the past 7 years, the number collected for Tree of Bikes will enable Happy Roots to provide bikes to ten times as many children as in the past.

Tree of Bikes is on display at ATC until December 18th, when they will be distributed to their new owners. Be sure to check out the tree while you can!

Caktus GroupCaktus CTO Colin Copeland Helps Launch Open Data Policing Website

Today, at Caktus headquarters, CTO and co-founder of Caktus Colin Copeland will stand at a press conference along with activists, police representatives, and elected officials to announce the launch of OpenDataPolicingNC.com. The first site of its kind, OpenDataPolicingNC.com draws on public records to publish up-to-date stop, search, and use-of-force data—broken down by race and ethnicity—for every police department and officer in the state of North Carolina. The volunteer effort, led by The Southern Coalition for Social Justice (SCSJ) with technical leadership from Colin, includes approximately 20 million anonymized data points from 15 years of NC traffic stop data.

Colin’s development team for the project included fellow Durham residents, data scientist Andy Shapiro and software engineer Dylan Young. Combined, the team holds over 29 years of software development experience. Working outside their normal, full-time jobs, Colin, Andy, and Dylan worked iteratively to build, test, and open source the new site. In addition, staff attorney of SCSJ, Ian Mance used his understanding of the needs of police agencies, lawyers, and government officials to ensure a user-friendly site.

“Traffic stops are the most common way citizens interact with police officers,” said Ian. “This site enables anyone who engages with these issues—whether they be police chiefs, courts, lawyers, or policymakers—to ground their conversation in the facts.” Today’s launch of the site is the exciting culmination of a long period of collaborative work between the site’s development team, SCSJ, and the Fayetteville PD. This past summer, SCSJ presented a beta version of OpenDataPolicingNC.com to the White House, the initial step to an eventual partnership with the Fayetteville PD and its Chief, Harold Medlock, both members of the White House’s Police Data Initiative. “The future of data-driven justice lies in collaboration between the tech community, nonprofits, and government services.”

That collaboration doesn’t end with the site’s launch. The Fayetteville PD and Chief Medlock will continue working with the OpenDataPolicingNC team to build new features for the site as well as promote increased police transparency. In addition, the team hopes to replicate OpenDataPolicingNC.com for other states.

Gary PosterElixir: Erlang records and the Erlsom XML library

I wanted to jot down some notes about Elixir, because I'm learning it, and because some of the pieces I assembled for my most recent code exercise were hard to find across the web. Hopefully it will help someone else, or at least be something I can refer to again in the future. I was playing around implementing the last exercise from Chapter 13 of Dave Thomas' Programming Elixir book: get the

Tim HopperI Love Twitter

Caktus GroupCyber Monday: 50% off Django book and videos

Are you looking for a good gift for a current or future Django developer? Check out Caktus technical director Mark Lavin's book and videos for O'Reilly.

Both are now 50% off at the O'Reilly site with the discount code CYBER15.

Tim HopperMy Python Environment Workflow with Conda

Many new Python programmers rely on their system install of Python to run their scripts. There are several good reasons to stop using the system Python. First, it's probably an old version of Python. Second, if you install 3rd party packages with pip, every package is installed into the same globally accessible directory. While this may sound convenient, it causes problems if you (1) install different packages with the same name, (2) need to use different versions of the same package, or (3) upgrade your operating system (OS X will delete all the packages you have installed).

For many years, best practice for Python developers was to use virtualenv to create a sandboxed environment for each project. If you use virtualenv, each project you work on can have its own version of Python with its own 3rd party packages (hopefully specified in a requirements.txt file). In my experience, getting started with virtualenv is cumbersome and confusing; to this day, I have to look up the command to create a Python 3 virtualenv.1

In 2015, I have almost exclusively used Python installations provided through Continuum Analytics's Conda/Anaconda platform. I have also switched from using virtualenvs to using conda environments, and I am loving it.

Before explaining my workflow, here's a quick glossary of the similarly-named products that Continuum offers.

  1. conda: "Conda is an open source package management system and environment management system for installing multiple versions of software packages and their dependencies and switching easily between them. It works on Linux, OS X and Windows, and was created for Python programs but can package and distribute any software."2 A conda install provides a whole suite of command line tools for installing and managing packages and environments. Because conda works for any software, it can even install different versions of Python (unlike pip).
  2. Anaconda: "Anaconda is a completely free Python distribution (including for commercial use and redistribution). It includes more than 300 of the most popular Python packages for science, math, engineering, and data analysis." It is available across platforms and installable through a binary.
  3. Anaconda Cloud: Also known as Anaconda.org and formerly known as Binstar, "Anaconda Cloud is a package management service where you can host software packages of all kinds." Anaconda Cloud is a package repository analogous to PyPI. Packages are installed via the conda command line tool instead of pip. By default, the conda install command installs packages from a curated collection of packages (a superset of those in Anaconda). Continuum allows users to host their own packages on Anaconda Cloud; these packages can also be installed through conda install using the -c flag with the username.

Conda, Anaconda, and Anaconda Cloud are distinct but interrelated tools; keeping them straight can be hard, but is helpful.

Conda (the package manager) can be installed in two ways: through the Miniconda installer or the Anaconda installer. Both install the package manager, but the latter also installs the 300+ packages for scientific Python. (Installing Anaconda is equivalent to installing Miniconda and then running conda install anaconda.)

Conda Environment Files

It has become standard for pip users to create a requirements.txt file for specifying dependencies for a particular project. Often, a developer working on a project will (1) create and activate a virtual environment and (2) run pip install -r requirements.txt to build an isolated development environment with the needed packages.

Conda provides an analogous (but more powerful) file: environment.yml.3

A simple environment.yml file might look like this:

name: numpy-env
dependencies:
- python=3
- numpy

If you are in a directory containing this file, you can run $ conda env create to create a Conda environment named numpy-env that runs Python 3 and has numpy installed4. Run $ source activate numpy-env to activate this environment. Once activated, running $ python will run Python 3 from your environment instead of the globally installed Python for your system. Moreover, you will be able to import numpy but not any of the 3rd party packages installed globally.

environment.yml can also install packages via pip with this syntax:

name: pip-env
dependencies:
- python
- pip
- pip:
    - pypi-package-name

I see environment.yml files as a positive development from requirements.txt files for several reasons. Foremost, they allow you to specify the version of Python you want to use. At Pydata NYC 2015, many presenters provided their code in Github repositories without specifying anywhere whether they were using Python 2 or 3. Because I included a YAML file, attendees could see exactly what version I was using and quickly install it with conda env create. I also like being able to specify the name of the environment in the file; this is particularly helpful when working with others. Finally, because conda can install from PyPI via pip, environment.yml files provide no less functionality than a requirements.txt file provides.

My Python Environment Workflow

Lately, whenever I am working on a new project (however big or small), I follow these steps:

  1. Create a project folder in the ~/repos/ directory on my computer.
  2. Create an environment.yml file in the directory. Typically the environment name will be the same as the folder name. At minimum, it will specify the version of Python I want to use; it will often include anaconda as a dependency.5
  3. Create the conda environment with $ conda env create.
  4. Activate the conda environment with $ source activate ENV_NAME.
  5. Create a .env file containing the line source activate ENV_NAME. Because I have autoenv installed, this file will be run every time I navigate to the project folder in the Terminal. Therefore, my conda environment will be activated as soon as I navigate to the folder (see the sketch after this list).
  6. Run $ git init to make the folder a Git repository. I then run $ git add environment.yml && git commit -m 'initial commit' to add the YAML file to the repository.
  7. If I want to push the repository to Github, I use $ git create using Github's hub commands. I then push the master branch with $ git push -u origin master.
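
Steps 5 through 7 as shell commands (a sketch; it assumes autoenv is installed and Github's hub is aliased to git):

$ echo "source activate ENV_NAME" > .env   # autoenv runs this on cd
$ git init
$ git add environment.yml && git commit -m 'initial commit'
$ git create                               # hub command: creates the Github repo
$ git push -u origin master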

As I add dependencies to my project, I try to be sure I add them to my environment.yml file.
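
When the file changes, the environment can be brought back in sync with it; conda env update re-reads the environment.yml in the current directory:

$ conda env update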

A major benefit of all this is how easily reproducible a development environment becomes. If a colleague or conference attendee wants to run my code, they can setup the dependencies (including Python version) by (1) cloning the repository, (2) running $ conda env create, (3) running $ source activate ENV_NAME. It's easy enough for me to drop those instructions and further instructions for running the code in a README file. If I'm feeling especially helpful, I'll create a Makefile or Fabfile to encapsulate commands for core functionality of the code.

An even larger benefit is that I can return to a project after days, months, or years and quickly start developing without first having to hunt for print statements to figure out whether I was using Python 2 or 3.

I've come to love environment.yml files, and I think you might too.


  1. virtualenv also provides no help in actually managing Python versions. You have to install each version yourself and then tell virtualenv to use it.

  2. From the conda docs

  3. Though there is currently a pull request for adding requirements.txt support to conda: https://github.com/conda/conda-env/pull/172

  4. Numpy will be installed from a binary from Anaconda Cloud, not built from source. 

  5. I created a bash command conda-env-file to automatically create an environment.yml file named after the current directory. 

Tim HopperSequential Minimal Optimization Algorithm for Support Vector Machines

In my nonlinear optimization class in grad school at North Carolina State University, I wrote a paper on the famed SMO algorithm for support vector machines. In particular, I derive the Lagrangian dual of the classic formulation of the SVM optimization model and show how it can be solved using the stochastic gradient descent algorithm.
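
For reference, the classic hard-margin formulation and its dual look like this (standard textbook notation, not copied from the paper):

\begin{aligned}
\text{(primal)}\quad & \min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^2
\quad \text{s.t.}\ y_i(w^\top x_i + b) \ge 1,\ i = 1,\dots,n \\
\text{(dual)}\quad & \max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
- \tfrac{1}{2}\sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j\, x_i^\top x_j
\quad \text{s.t.}\ \alpha_i \ge 0,\ \textstyle\sum_{i=1}^{n} \alpha_i y_i = 0
\end{aligned}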

You can find the paper here.

Caktus GroupWhat human-centered design can do for international development

Cross-posted with Creative Associates International. Written by Gina Assaf (Creative Associates International) and Tania Lee (Caktus Group). Image courtesy of Creative Associates International.

A human-centered design process is a critical step that is often overlooked when making important decisions in technology for development.

In the development space, funding usually comes from a donor or an agency, while the service or product it pays for goes to a different place: local communities in need around the world. The indirect link between the donors and the beneficiaries can lead to a lack of accountability—as Dave Algoso explains in his blog on Reboot.org.

But practicing human-centered design, or user-centered design, which puts the needs, wants and limitations of users first in each stage of the design process, can help ensure that the services and products actually meet the needs of the end users—the communities we seek to empower and serve.

For example, funding a software product that requires high bandwidth connectivity when the population it is meant to serve has minimal Internet connectivity will not benefit that community. Similarly, a mobile banking tool utilizing lots of text may not be appropriate for low-literacy communities.

Not practicing human-centered design could mean wasting funding on initiatives that simply aren’t effective and losing collaborative opportunities for designing better solutions.

At this year’s UXDC conference, organized by the DC Chapter of the User Experience Professionals Association, technologists, development practitioners and design professionals gathered to discuss “Lessons and Challenges in Implementing User Centered Design and UX Practices in the International Development Sector.” Human-centered design is gaining more and more traction in the international development realm.

Is international development ready for human-centered design?

International development funding mechanisms are mainly donor-driven, and convincing donors to invest in user research, user experience design or iterative prototyping (all important tenets of human-centered design) can be a tough sell. Funding models in development are not ideally set up for iteration, a key component for human-centered design.

Development funds are often taxpayer-funded and somewhat restricted in how they can be spent. Investing in a process that may require more upfront work and iteration may be considered wasteful.

The Request for Proposal process is set up in a way that assumes we have enough information to make decisions when pitching a proposal, and the turnaround timelines are very short. This is challenging for proposing and implementing technology innovations since it leaves little room for a human-centered design process, which requires more upfront thought, needs assessment and iteration.

To add to the hurdles, there is a lot of the confusion in the vocabulary between the user experience community and the development and humanitarian communities. The two groups speak different languages. For example, humanitarian and development communities use terms like “access,” “capture,” “accountability,” “dignity” and “livelihoods.” User experience communities use terms such as “co-design,” “user-centered,” “journey mapping” and “need-finding.”

But there is room for finding common ground, if not a common language.

Some parts of the human-centered design process are already implemented (to some degree) in development but described and worded differently.

For example, the “capture” phase in development is when “need-finding” occurs in design. “Need-finding” is an essential part of human-centered design, where we understand the needs, limitations and concerns of the target users. The “capture” phase in development performs a similar task, but could be further enhanced by making the beneficiary more central to this research and make use of some of the design research method tools used in the human-centered design process.

Importance of participatory design

Despite the challenges, there are many opportunities to promote and improve on the use of human-centered design practices within development.

For example, participatory design attempts to actively involve all stakeholders (e.g. employees, partners, customers, citizens, and end users) in the design process to help ensure the result meets their needs and is usable. This directly relates to the first principle of the Principles for Digital Development: “Design with the user.” A participatory design process builds collaboration across various internal teams that often work in silos: programs, technology, grants, procurement, and more.

However, “design with the user” isn’t always easy.

Within development and humanitarian spaces, often the user or the person most affected could be a part of an extremely vulnerable group or hard to reach, and working directly with extremely vulnerable groups is not always possible or advised. In addition, funding restrictions may inhibit participatory design. For example, some U.S. government agencies won’t allow more than 10 external persons to participate in projects due to funding restrictions.

Also, there are multiple groups of stakeholders in development: community members, community leaders, development staff, development staff managers, donors, and more. Balancing and prioritizing needs is a challenge.

Bridging the development-user experience divide

There is a lack of understanding across user experience professionals and international development communities—a fact confirmed by panelists and participants at the UXDC session. But it is not an unbridgeable gap. More conversations and discussions around this topic are needed and panels, like the one at UXDC, help bridge this gap.

While there are many obstacles to implementing the entire human-centered design process and methods in international development projects, there are positive trends—including the emergence of development labs and innovation hubs with an emphasis on capacity building for teams and human-centered design training. There also seems to be receptiveness and an eagerness among local innovators in developing countries to move forward with human-centered design.

For example, Creative Associates International worked in Honduras using a human-centered design approach, and performed research in partnership with the local field staff to understand how the computer lab could better serve the wants and needs of the community.

Creative trained the local Honduran staff, performed user interviews together, synthesized the results, and brainstormed solutions together. Creative is now in the process of testing and iterating on some of the proposed solutions. The team there is very positive and enthusiastic about this process; they are leading it forward and seeing how their work can make a big impact.

Caktus Group offers similar services to clients and recently conducted a discovery workshop in Turkey with a humanitarian NGO serving local refugee communities. The discovery workshop was conducted with roughly 15 client staff and used a series of brainstorming, mapping, outlining, crowdsourcing, and voting activities to develop terms of success, user personas, and journey maps, and to define a minimum viable product.

Caktus has since been working with this client to co-design and iteratively build a custom web application that fits their unique needs.

Human-centered design principles are already being used in many development programs, but there are many more opportunities for improvement. Better metrics, along with further discussion and collaboration on human-centered design practice in development, can help address the challenges, demonstrate value, and achieve lasting results in people’s lives.

*Gina Assaf is a Product and User Experience Manager with the Technology for Development team at Creative Associates International. Tania Lee is a SMS & Web Product Strategist at Caktus Group.

For more information about the UXDC community or to join the human-centered design working group, please contact Gina Assaf, ginaa@creativedc.com; Tania Lee, tlee@caktusgroup.com; or Ayan Kishore, ayank@creativedc.com.*

Tim HopperKnitting

True story: I'm a closet knitter. I don't have much time for it these days, but it helped keep me sane in grad school. Here are some things I've made over the years.

[Photos of five knitted projects]

Caktus GroupInitial Data in Django

I've struggled to find an ideal way to load initial data for Django projects. By “initial data,” I'm referring to the kind of data that you need on a new system for it to be functional, but could change later. These are largely lists of possible choices, such as time zones, countries, or crayon colors.

Here are my requirements:

  • Fairly simple to run on initial server deploy, initial development environment setup, and when starting a test run.
  • Does not risk overwriting changes made to records in the live database after they're initially created.
  • Not too hard to update from the current live data, so that future new deploys etc. get the latest data.
  • Copes well as models evolve, because they will.
  • Well supported by Django.
  • Not too great a performance impact on testing.

Here are some of the approaches that I've tried.

Fixtures

Fixtures are how Django used to recommend loading initial data.
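For anyone who hasn't used them: a fixture is a serialized dump of records that you create with dumpdata and load with loaddata. Here's a minimal sketch of the workflow, using a hypothetical myapp.CrayonColor model (the app, model, and fixture names are illustrative, not from a real project):

# Dump the current records into a fixture file:
#   python manage.py dumpdata myapp.CrayonColor --indent 2 > myapp/fixtures/crayon_colors.json
# Load them into a database:
#   python manage.py loaddata crayon_colors

# In tests, loading a fixture is just an attribute on the test case:
from django.test import TestCase

class CrayonColorTests(TestCase):
    fixtures = ['crayon_colors']

    def test_colors_are_loaded(self):
        from myapp.models import CrayonColor
        self.assertTrue(CrayonColor.objects.exists())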

Pros:

  • It's fairly easy to update the fixtures as the "initial" data evolves - e.g. you've added more options in your live server, and want to preserve them in the initial data, so just do another dumpdata.
  • Fixtures don't slow down test startup, because they don't get loaded automatically (as long as they're not named initial_data, or you're using a recent version of Django, which dropped automatic loading entirely).
  • Easy enough to load at the beginning of tests that need them by adding a fixtures attribute to the test case class.

Cons:

  • fatal - If a fixture is loaded again, it overwrites any changed data in the database with the original values
  • Discouraged by current Django documentation
  • Hard to keep valid when models evolve. The right way would be: every time a model changes, update the fixtures from the current data, then create a fresh temporary database without the new migrations applied, load the current fixtures, apply the new migrations, and make a fresh dump of the initial data. But that’s a lot of work, and hard to remember to do every time models change.
  • Data is not automatically available during tests, and since our system won't run correctly without some of this data, you have to arrange to load or create it at test setup.
  • Not loaded automatically, so:
    • When setting up new development environments, you must document it and it’s still easily overlooked, or else get a developer to run some script that includes it.
    • For automated deploys, it’s not safe to run on every deploy; probably the only safe approach is to run it manually after the first deploy.

Summary: rejected due to risk of data loss, inconvenience during development, and negative recommendation from Django documentation.

Fixture hack

I played around with a modified loaddata command that checked (using natural keys) if a record in the fixture was already in the database and did not overwrite any data if the record had previously been loaded.

This means it's safer to add to scripts and automatic deploys.
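Here's a minimal sketch of the idea, assuming the models involved define natural_key() and their managers implement get_by_natural_key(); this is illustrative, not the exact command I used:

from django.core import serializers
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = "Load a fixture, skipping records that already exist (matched by natural key)."

    def add_arguments(self, parser):
        parser.add_argument('fixture_path')

    def handle(self, *args, **options):
        with open(options['fixture_path']) as fixture:
            for obj in serializers.deserialize('json', fixture):
                model = obj.object.__class__
                try:
                    # If the record was loaded before, leave it alone.
                    model.objects.get_by_natural_key(*obj.object.natural_key())
                except model.DoesNotExist:
                    obj.save()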

Pros:

  • Fairly easy to update as "initial" data evolves - e.g. you've added more options in your live server, and want to preserve them in the initial data, so just do another dumpdata
  • Fixtures don't slow down test startup, because they don't get loaded automatically (as long as they're not named initial_data, or you're using a recent version of Django).
  • Easy enough to load at the beginning of tests that need them by adding a fixtures attribute to the test case class.
  • Can add to env setup scripts and automated deploys safely

Cons:

  • Hard to keep valid when models evolve
  • Data is not automatically available during tests
  • Not loaded automatically, so when setting up new development environments, you must document it and it’s still easily overlooked, or else get a developer to run some script that includes it

Summary: rejected; it mitigates one problem with fixtures, but all the others remain.

Post-migrate signal

Something else I experimented with was running code to create the new records in a post-migrate signal, even though the docs warn against data modification in that signal.
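A minimal sketch of that experiment, with the handler wired up in the app's AppConfig (the app and model names here are hypothetical):

from django.apps import AppConfig
from django.db.models.signals import post_migrate

def create_initial_data(sender, **kwargs):
    # Import inside the handler so the app registry is ready when it runs.
    from myapp.models import CrayonColor
    for name in ('red', 'green', 'blue'):
        CrayonColor.objects.get_or_create(name=name)

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # Fires after every migrate of this app, including reverse migrations.
        post_migrate.connect(create_initial_data, sender=self)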

Pros:

  • Runs automatically each time migrations are run, so will automatically get run during most automated deploys
  • Runs automatically when tests are setting up the test database, so all tests have the data available - but is part of the initial database, so we don't have the overhead of loading initial data during every test's setUp.

Cons:

  • fatal - Runs every time migrations are run, even reverse migrations - so it can run when tables are in the wrong state, which breaks development whenever you migrate forward and back.
  • If it fails, the whole migration fails, so you can't just ignore a failure even if you didn't care about creating the initial data that time
  • Slows down database creation when running tests, unless you use --keepdb

Summary: rejected; not a valid way to load initial data.

In a migration

Add a migration that creates the initial records.
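Something like the following, with get_or_create guarding against overwriting records that already exist (the app and model names are placeholders):

from django.db import migrations

def create_initial_data(apps, schema_editor):
    # Use the historical version of the model, not a direct import.
    CrayonColor = apps.get_model('myapp', 'CrayonColor')
    for name in ('red', 'green', 'blue'):
        CrayonColor.objects.get_or_create(name=name)

class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0001_initial'),
    ]

    operations = [
        # No-op on reverse so the migration can be unapplied cleanly.
        migrations.RunPython(create_initial_data, migrations.RunPython.noop),
    ]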

Pros:

  • This is what the Django documentation currently recommends
  • Runs automatically
  • The migration only runs when the database schema matches what it was when you wrote it, so it won't break as models evolve
  • You can write it to ignore records that already exist, so it won't overwrite later changes in the database

Cons:

  • fatal in some cases - migrations don't use the actual model class, so models with custom behavior (like MPTTModel) won't get created correctly. You might be able to find workarounds for this on a case-by-case basis.
  • Slows down database creation when running tests, unless you use --keepdb
  • Harder than fixtures to update as the initial data evolves. Options:
    • Go back and edit the original migration - but then it won't run on existing databases and they won't get the new records
    • Add a new migration that adds the whole updated initial data set, then go back and comment out the code in the previous initial data migration since there's no point running it twice on new database setup
    • Add yet another migration for just the new data - probably the simplest in terms of updating the migrations, but it'll be harder to extract just the new data from the current database than to just extract the whole dataset again. Also, you don't preserve any edits that might have been made over time to older records.

Summary: best option so far. It has some drawbacks, but not as bad as the other options.

Conclusion

The best approach in most cases is probably to load initial data in a migration, as the Django documentation recommends. It's not perfect, but it avoids some of the fatal flaws of other approaches. And the new (in Django 1.8) --keepdb option helps ameliorate the slow test startup.

I'm still curious if there are other approaches that I haven't considered, though.

Tim HopperMy First Publication

While at RTI International, I worked on an exploratory analysis of Twitter discussion of electronic cigarettes. A paper on our work was just published in the Journal of Medical Internet Research: Using Twitter Data to Gain Insights into E-cigarette Marketing and Locations of Use: An Infoveillance Study.1

Marketing and use of electronic cigarettes (e-cigarettes) and other electronic nicotine delivery devices have increased exponentially in recent years fueled, in part, by marketing and word-of-mouth communications via social media platforms, such as Twitter. ... We identified approximately 1.7 million tweets about e-cigarettes between 2008 and 2013, with the majority of these tweets being advertising (93.43%, 1,559,508/1,669,123). Tweets about e-cigarettes increased more than tenfold between 2009 and 2010, suggesting a rapid increase in the popularity of e-cigarettes and marketing efforts. The Twitter handles tweeting most frequently about e-cigarettes were a mixture of e-cigarette brands, affiliate marketers, and resellers of e-cigarette products. Of the 471 e-cigarette tweets mentioning a specific place, most mentioned e-cigarette use in class (39.1%, 184/471) followed by home/room/bed (12.5%, 59/471), school (12.1%, 57/471), in public (8.7%, 41/471), the bathroom (5.7%, 27/471), and at work (4.5%, 21/471).


  1. I have no idea what "Infoveillance" means. 

Caktus GroupThe Long, Hard Haul to Uncovering a Single, Tiny CSS Property Bug in IE10

There’s a very small but devastatingly crash-inducing bug in Internet Explorer 10. Watch out for setting a background-color to inherit on any pseudo element (like ::before and ::after), because this will crash IE completely every single time.

#element::after {
    background-color: inherit; /* CRASH! */
}

Rather than dedicate an entire post to a single CSS property, let’s trace the work that was required to track this down. The payoff is obviously important (a supported browser crashing is a big problem!), but it can feel frustrating when a big issue comes down to something so seemingly tiny. It is important to remember that the real measure of a bug is not the number of lines involved in fixing it (one, in our case) but the amount of effort it took to track down. Don’t let the small fix (just give it a color directly) diminish the hard work you put into understanding the issue.

First, how did we get here, anyway? A crashing browser is something you hope doesn’t go unnoticed for long.

It is pretty common to avoid running full regression tests across a site, both early on and throughout the project. When you have a dozen or more supported browsers, the most efficient workflow tends to be getting a lot of feature work and layout under your belt early, and then attacking each browser’s quirks after testing what you’ve built across them. If you try to test the entire supported set of features for every small changeset, especially on a pre-launch product, you’re going to repeat a lot of work and sink a lot of hours you could better spend getting the initial work wrapped up.

So a couple of days of iteration had gone by, with new feature work and a few bug fixes, when we found this crash that perplexed us. At that point, the cause could have been any one of a dozen commits across a number of components. We had expected some tweaks would be needed to bring Safari and Internet Explorer in line, but the outright crash certainly took us by surprise. We set out to narrow things down. We ruled out SauceLabs, the excellent browser testing service, by reproducing the crash locally on our own setup. IE 11 didn’t crash, and most of the site was fine except this one page, so we knew it had to be a pretty specific issue.

We had enough to start tracking it down. Being able to reproduce it reliably and only on one browser meant we could really focus any tests we did, so we got started right away.

Having confirmed the crash was reproducible, how do we start to narrow it down when we aren’t sure when it was introduced, and the crashing nature of the bug prevents us from inspecting what’s going on?

While the problem happened on a single page, that wasn’t enough to go on to tell us where it was happening. We had not even discovered that it was a CSS issue at this point, so basically anything was suspect.

We turned to the ever-useful git bisect command, which every developer should have in their toolbelt. Git bisect, if you haven’t had an opportunity to learn about it yet, helps you locate the specific commit in your history that introduced a bug. You start by telling it the last known commit where you can confirm the bug did not exist and the latest commit (usually the head of your branch or master) where the bug does exist; in practice, that's git bisect start, then git bisect good <commit> and git bisect bad <commit> to set the bounds. Git then narrows it down, stepping roughly halfway between the known good and bad commits and at each step asking you “What about this commit, is it good or bad?”, which you answer with git bisect good or git bisect bad. At the end of the process, which usually takes less than half a dozen steps, you’ll have the single commit where the problem first appears. This is great when you have a problem that went undetected for a while and you need a smaller target to aim for.

The diff in question was CSS only, so now we knew the problem was in our CSS! This really helped narrow things down, both to one aspect of our changes (neither markup nor Javascript seemed to be at fault) and in the number of changes to look at more closely (it was not a very large diff). It was, however, much more surprising than a strange Javascript bug would have been. Who ever heard of CSS crashing a browser, after all? Our first thought was a parser bug, and we pored over the diff looking for errant curly braces or other oddities that might have gone unnoticed and caused the crash. Everything looked flawless from a syntax perspective, so we were left with the conclusion that some actual style was causing the failure, rather than a parser issue.

As both a sanity check and a confirmation of our findings so far, we emptied the CSS file the changes were isolated to and pulled the page back up in Internet Explorer 10. A very ugly, but non-crash-inducing, version of the page rendered, confirming that something in this one CSS file was the culprit. Yet a quick look at the short commit we had identified as the point where the problem was introduced gave us no good candidates.

There were a few color changes. A bit more padding on one element and a slimmer margin on another. Heck, several of the changes were just fixing indentation in a few places! Nothing stood out.

One by one we commented out the rulesets that had been changed in the diff until we found one that only caused the problem when it was present:

#cssmenu::after {
    position: absolute;
    top: 0;
    left: -50%;
    width: 150vw;
    height: 100%;
    content: "";
    background-color: inherit;
    z-index: -1;
}

This is a standard trick to create wider background padding around a section that needs to expand beyond the content well. It is just a simple pseudo element with all the necessary properties. For this technique, the position, top, left, width, height, and content properties are all required, which leaves the final two properties, background-color and z-index, as the likely causes.

Finding that the answer was background-color was only a matter of commenting out each of the remaining properties one by one, revealing that the combination from the start of this post was our problem all along:

#cssmenu::after {
    ...
    background-color: inherit;
}

We had found the problem and immediately knew the solution was easy. The color being inherited was just “white”, and we had only used “inherit” to remove a minor redundancy. Explicitly stating background-color: white here fixed the problem immediately.

But while the fix took seconds, tracking down the bug had taken the two of us nearly an hour. We had to know more about what had caused us so much headache, so we followed up with a few experiments.

We started with a theory, backed by a false-positive test, that this affected ::after but not ::before elements; further testing quickly showed it affects both varieties of pseudo element.

We dug around as much as we could without sinking too much time into it, but could not find any mention of this problem in the usual places people complain about things (StackOverflow, MSDN forums).

If you’ve found this post because you googled the terms “css after background-color inherit crash” then I hope you can breathe a sigh of relief and move on with your day.

For the reader following this story for the journey and not the solution, remember this the next time you feel betrayed by a long debug session that reveals what seems like a small issue: you did not waste your time. The measure of a bug should not be the lines of code or the time needed to fix it once you understand it, but the effort it takes to uncover, understand, and correct it. If we let false measurements of how big or difficult a bug really is deceive us, focusing only on the final and sometimes minor step of fixing it, we devalue the very important and difficult work of uncovering complicated problems. And that is where real skill and experience lie, not in the few lines it might take to write the fix.

Footnotes