‘Tis the season for giving, but it can be difficult to find the perfect gift for the special techie in your life. Whether you’re looking for something fun and quirky or something more challenging, here are 20 unique gift ideas from our technical team:
For Pi(e) Lovers
Raspberry Pi 4 Complete Starter Kit
This 4GB kit is exclusive to CanaKit and includes everything you need to get started with a Raspberry Pi mini computer. While it’s aimed at young minds, this is a great gift for anyone who’s interested in coding and learning programming skills with an affordable, portable tool.
Raspberry Pi 1 Model B+
The Model B+ is the final revision of the original Raspberry Pi. It replaced the Model B, and has several upgrades including more USB ports and better audio. This inexpensive mini computer is ideal for DIY projects.
HiFiBerry Amp+
This power amplifier mounts onto the Raspberry Pi to create a stereo audio system. Our CEO Tobias McNulty would love one of these to stream music to an old pair of speakers. It can also be used as a building block for multi-room audio installations.
BrickPi3 Base Kit
The BrickPi3 by Dexter Industries makes the sky the limit with LEGO MINDSTORMS (a LEGO platform for creating programmable robots). You can connect LEGO sensors and motors to the Raspberry Pi to build your own robot.
For Greater Peace of Mind
A Year of Backblaze Cloud Backup
Backblaze offers unlimited backups for a set price. According to our Technology Support Specialist Scott Morningstar, Backblaze found an innovative way to use consumer hard drives in their data centers that keeps costs down. They also report drive statistics and failure rates every quarter. No backup is perfect, but Backblaze comes close. Scott’s family uses it, making his role as family tech support during the holidays much easier. Backblaze provides unlimited cloud storage and runs on Windows and macOS. You can purchase a year of unlimited Backblaze backups as a gift.
Ultrastar 12 TB Hard Drive
These mighty hard drives are built for maximum storage. They’re designed to handle workloads of up to 550TB per year, have a high reliability rating, and aren’t power hogs.
For Tuning In & Rocking Out
Bose QuietComfort 35 Wireless Headphones II
These wireless, noise-cancelling headphones are ideal for air travel or drowning out the background noise while you work. Google Assistant and Amazon Alexa are also built in to provide hands-free access to millions of songs, playlists, and more. They’re also enabled with Bose AR, a first-of-its-kind audio augmented reality platform. Our COO Nicole Foster would especially love a pair in rose gold (nice choice!).
Bluetooth Beanie
There are lots of these on the market — here are the top ones, according to BestReviews, which lists pros and cons of each. Our CBDO Ian Huckabee says he’d love to have one of these to keep warm and listen to his favorite music or podcasts while walking his dog.
Korg Volca Keys or Sample
Full-size synthesizers are a bewitching array of knobs, buttons, and patch cords. Korg’s Volca line is meant for those who are interested in figuring out how they work with minimal cost — sort of a ukulele equivalent for the Kraftwerk crowd. Volca Keys is an analog synthesizer, and the Volca Sample is a sample-based drum machine. Both have built-in sequencers and plenty of knobs and options to fiddle with and bend their sound, which is most of the appeal, says our Lead Project Manager Gannon Hubbard.
For the Homebody
Sous Vide with WiFi
This is perfect for the hardworking techie who wants a home-cooked meal when they get home. Sous vide is a cooking method in which food is placed in a plastic pouch or a glass jar and cooked in a water bath with an instrument that precisely controls the temperature. Our QA Analyst Kat Smith swears by the Anova brand sous vide that connects to WiFi, so she can control and monitor the cooking temperature remotely. Kat says, “If you're a snob about the temperature of your steak this is the perfect gift, and you can cook safely while you’re not at home.”
Wyze Cam V2
This very affordable and sleek little smart home camera doesn’t skimp on quality. It’s equipped with 1080p HD video, night vision, and 2-way audio. You can connect it to a live stream app and enable the push notifications to alert you of motion or sound. It also works with Alexa and Google Assistant, and features continuous recording with local or cloud storage.
Casper Glow Light
The Glow Light is a portable, warm, soft light that helps you wind down for bed. You can set it to gradually dim to help you fall asleep, and to gradually brighten in the morning to help you wake up more naturally. It works with the Glow app so you can customize your bedtime and morning routines.
Secretlab OMEGA Chair
This customizable, award-winning chair provides maximum comfort and support. Developer Christopher Dixon says this “dream chair” would be a great addition to anyone’s gaming or office space. It also includes a full-length backrest that reclines almost fully, in case you need a nap between gaming or working sessions.
For Staying in Touch
Latest Apple iPhone
The iPhone 11, 11 Pro, and 11 Pro Max are the latest and greatest smartphones by Apple. See how they compare to each other. Members of our team (including myself) said they’re specifically interested in one of these models because of their storage capacity, battery life, and advanced camera.
Solar Phone Charger
Solar technology has improved dramatically over the last few years, and the new models of solar phone chargers can pack a surprising amount of power. They're small enough to take camping or to a football game, and are beneficial at home during a power outage or severe weather event. Our Account Executive Tim Scales suggests the Venture 30 Power Bank + Nomad 7 Plus Solar Kit.
mAP lite
This very small, wireless access point packs a big punch for its size. It’s ideal for traveling because it can be used to extend your hotel internet to all your devices. It can also serve as a client device to improve your laptop signal range, or as a simple configuration tool for your servers, if there is no ethernet on your mobile device.
For the One Who Has Everything
Drop CTRL Mechanical Keyboard
This mechanical keyboard has a solid aluminum frame and is programmable using QMK software. It includes hot-swap switch sockets, which allows you to change switches without soldering. It’s outfitted with the fastest available connections. The RGB backlighting and underlighting is customizable.
1083 Heat Touch Hellfire Gloves
These Hellfires are the only gloves that our CBDO Ian Huckabee says would be more amazing than the “best gloves he ever owned,” which he had to toss out last winter. These high-tech, powered gloves can provide more than 12 hours of heat and are ideal for snowboarding, skiing, or other outdoor winter sports (dog sledding, anyone?).
mBot Robot Kit
This educational robot kit from Makeblock is great for little and big kids alike! Build your own robot from scratch and learn about a variety of robotic machinery and electronic parts, as well as the fundamentals of block-based programming.
X3 Hurricane
The X3 Hurricane is a variable speed, canless air system for cleaning computers, server rooms, camera lenses, and even circuit boards. It’s a powerful but quiet cleaner. One unit is equal to over 5,000 cans of a traditional duster.
The items on this list are independently suggested by Caktus Group team members, and are not sponsored or endorsed. Caktus Group does not have any affiliation with these items and will not receive any commission from the sale of these items.
Within the past year, my development team at Caktus worked on a project that required a front-end framework to build a fast, easy-to-use product for a client. After a discussion of frameworks such as React, Vue, and Angular, and their approaches, our team settled on using Vue.js, along with a Django back-end with Django REST Framework (DRF). Initially, we chose Vue because we were more familiar with it, rather than its current competitor React, but as we worked on the product, we ended up having a number of team discussions about how to organize our code well and avoid extra code debt. This blog outlines some of the development patterns we chose as we worked through a number of issues, such as simplifying a multitude of almost identical Vuex mutations, finding a consistent way of holding temporary state, and working with nested objects on the front-end and back-end.
Note: this blog post assumes familiarity either with Vue.js, or a similar front-end framework, like React or AngularJS.
Issue 1: Almost Identical Mutations
In our use of Vue.js, we chose to use the Vuex library for state management in a store, as recommended by the Vue documentation. Following Vuex documentation, we created a number of mutations to alter the state of objects inside of our store. For example, a user in our store (in this blog post’s examples, we’ll focus on this user object) could look like:
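Something along these lines, where the property names, values, and mutations are illustrative rather than the project's actual code:

import Vue from 'vue'
import Vuex from 'vuex'

Vue.use(Vuex)

export default new Vuex.Store({
  state: {
    user: {
      first_name: 'Ada',
      middle_name: '',
      last_name: 'Lovelace',
      gender: '',
      date_of_birth: '1815-12-10'
    }
  },
  mutations: {
    // One mutation per property quickly becomes repetitive:
    updateUserFirstName (state, firstName) {
      state.user.first_name = firstName
    },
    updateUserLastName (state, lastName) {
      state.user.last_name = lastName
    }
    // ...and so on for every property of every object in the store
  }
})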
This worked great when we had only a few objects in the store, but this pattern became increasingly repetitive as both the number of objects in the store (user, company, address, etc.) and the number of properties for each object (first name, middle name, last name, gender, date of birth, etc.) increased. It became clear that having a mutation for each property (updateUserFirstName, updateUserMiddleName, updateUserLastName, updateUserDateOfBirth, etc.) led to redundant lines of mutation code. As a result, we wanted a less repetitive, more DRY (Don’t Repeat Yourself) way of updating these objects, in order to keep our codebase readable and maintainable. After some discussion and coding, we created a generic and flexible mutation that would allow us to update any property of the user object:
updateUserAttributes (state, updatesObject) {
  /* Update specific attributes of the user in the store. */
  state.user = Object.assign(deepcopy(state.user), updatesObject)
},
This way, we could update the user’s first name and last name with a single commit of updateUserAttributes from any of our Vue components, which is both fewer lines of code (less code debt) and easier to write.
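For example, a component method could commit something like this (the field values here are purely illustrative):

this.$store.commit('updateUserAttributes', {
  first_name: 'Ada',
  last_name: 'Lovelace'
})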
Note: The code above uses a deepcopy() function, a utility we wrote (using lodash’s cloneDeep() method) that copies an object along with all of its nested fields and objects, so that the copied object’s nested data no longer points to the original nested data. As we found out earlier, simply using Object.assign() creates a new object whose nested fields still point to the original object’s nested fields, which becomes problematic if the new object gets committed to the store but the original object’s nested field later gets changed, surprisingly leading to the same field change on the new object.
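A quick sketch of that pitfall (the values are illustrative):

const original = { name: 'Ada', address: { city: 'Durham' } }
const shallow = Object.assign({}, original)
console.log(shallow.address === original.address)  // true: the nested object is shared
original.address.city = 'Raleigh'
console.log(shallow.address.city)                  // 'Raleigh': the "copy" changed too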
Issue 2: Using Store Or Components For Temporary State
Another issue we discussed at length had to do with how to manage the state of what the user was doing on a particular page. For example, if the user is editing their personal information on a page, we could hold that data in the store, updating the store as the user types on their keyboard. Alternatively, we could let the Vue component hold onto the data, save it in the back-end when the user presses 'Save,' and wait for the next page load to load it into the store. Or, we could let the Vue component hold onto the data and update both the store and the back-end when the user clicks ‘Save.’ Here’s a breakdown of the pros and cons of each approach:
Only refer to the store (update the store as the user types):
Pros: only one place to manage state; the user sees changes immediately.
Cons: the user may not want to see changes in multiple components while typing.

Update the store and back-end on 'Save':
Pros: the user sees changes when clicking 'Save'; forcing the user to click 'Save' gives a clearer indication of what the user is doing.
Cons: multiple places to manage state (the store for permanent state and components for temporary state).

Update the back-end on 'Save' (the store picks up the changes on the next page load):
Pros: the user sees that clicking 'Save' changes the page; forcing the user to click 'Save' gives a clearer indication of what the user is doing.
Cons: multiple places to manage state (the store for permanent state and components for temporary state); other components only see the changes on the next page load.
Though making the changes instantly in the store took away the need to manage data in the Vue component, we ultimately decided that it would be more reusable and useful to save the changes both in the front-end store and in the back-end API when the user clicked a ‘Save’ button. Moreover, having the user click a ‘Save’ button seemed a clearer indication that they want to save a piece of data than just typing a value in a field. As a result, we ended up utilizing Vue components’ data to hold the local state of the objects, and sent those objects to both the back-end and the store when the user clicked ‘Save.’ In such a pattern, a user detail page could look something like this:
<template>
...
</template>
<script>
export default {
name: 'UserDetailPage',
data () {
return {
localUser: {}
}
},
methods: {
saveUser () {
/* Make an API call to the backend, and update the store. */
...
}
}
}
</script>
So the data that the user was editing would be held in the localUser object, and when the user clicks ‘Save,’ the data would be sent to both the back-end and the store.
Issue 3: Nested Objects
A third issue we had a number of discussions about was how to manage an object’s relations to other objects. From our user example above, a user likely has an address (which is its own object), and the user could have a number of siblings, who have their own addresses, and the user could work at a company, which also has an address, which could have employees, who have their own addresses. Very quickly, the user object could have many layers of nested data, which can become rather unmanageable:
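As an illustrative sketch (not the project's real data), such a user object might end up looking like this:

const user = {
  id: 1,
  email: 'ada@example.com',
  address: { id: 10, street: '123 Main St', city: 'Durham' },
  siblings: [
    { id: 2, address: { id: 11, street: '456 Oak Ave', city: 'Raleigh' } }
  ],
  company: {
    id: 5,
    address: { id: 12, street: '789 Elm St', city: 'Durham' },
    employees: [
      { id: 3, address: { id: 13, street: '12 Pine Rd', city: 'Cary' } }
    ]
  }
}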
One solution for avoiding unmanageable nesting would be to manage each object’s nested relations with only the related object’s ID in the nested fields:
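In that scheme, the same (illustrative) user would hold only IDs for its relations:

const user = {
  id: 1,
  email: 'ada@example.com',
  address: 10,    // address ID
  siblings: [2],  // sibling user IDs
  company: 5      // company ID
}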
This makes each object’s data much smaller in the store, and decreases repetition of data (for example, when a user’s address matches their sibling’s address, we don’t need to have multiple places with the same data; only referring to the object’s ID should be enough).
However, this approach makes communication between the front-end and the back-end more complex. For instance, the user may want to edit their own information on their detail page to:
change their email
change their address
add a child
In such a case, the front-end may have to make three API calls (one to create an address, one to create a child, and one to update user.email, user.address, and user.children), which could take significantly longer than just making one API call to update everything related to the user. As an aside, learn more about creating an API endpoint in this post.
Seeing the pros and cons of each approach, we had some discussion about which way to develop, and after working through several API endpoints, we ended up with the nested approach, though we also tried to limit the amount of nesting that the back-end would be forced to handle. Django REST Framework does support editing an object’s relations to other objects by writing custom update() or create() serializer methods (read more here) and we were able to write these methods for some of our serializers. As a result, we were left with some nested data, but tried to be cognizant of how many nested relations we were using, and to limit them when possible.
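As a rough sketch of that approach (assuming hypothetical Address and User models; this is not the project's actual code), a writable nested serializer in Django REST Framework might look like:

from rest_framework import serializers

from .models import Address, User  # hypothetical models, for illustration only


class AddressSerializer(serializers.ModelSerializer):
    class Meta:
        model = Address
        fields = ['id', 'street', 'city']


class UserSerializer(serializers.ModelSerializer):
    # Nested, writable representation of the related address.
    address = AddressSerializer()

    class Meta:
        model = User
        fields = ['id', 'email', 'address']

    def update(self, instance, validated_data):
        # DRF won't write nested data automatically, so handle the address ourselves.
        address_data = validated_data.pop('address', None)
        if address_data is not None:
            for field, value in address_data.items():
                setattr(instance.address, field, value)
            instance.address.save()
        return super().update(instance, validated_data)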
Team Discussions Led to Sustainable Decisions
Throughout the course of the project, we continued to have conversations about the development patterns we were forming and following, and we attempted to make decisions that would lead to a more maintainable app with less code debt. Through these team discussions, we were able to come up with solutions that worked for us in terms of handling very similar Vuex mutations, handling temporary state, and working with nested data. As a consequence, we were able to deliver a product that satisfied our client’s needs and remains maintainable for our future work. We continue to have such conversations as we build apps that are maintainable and work well for our users.
I recently attended All Things Open in Raleigh. Bringing together more than 4,000 attendees, it is the largest open tech event on the East Coast, and is focused on “exploring open source, open tech, and the open web.” This year, ATO included more than 240 sessions across 22 tracks, ranging from front-end development to internet-of-things to studies of open data in government. The event was much larger than the last time that I attended in 2017.
Some of the more interesting sessions I attended were related to education and people’s experiences working with open data. In particular, I enjoyed Enthusiastic “Yes!” in Mathematics Education by Maria Droujkova & Dmitri Droujkov. As a former teacher, I found their examples of making mathematics education more open to students’ choices and voices compelling, and also relevant to teaching and mentoring in technology.
Another interesting talk was Roy Derks’ Open-sourcing JavaScript at the City of Amsterdam, where I learned a lot about the City of Amsterdam’s efforts to not only make their code a public repository, but, more importantly, to share data in a way that allowed people to make sense of the data, and to gain knowledge about the city as a result. For example, the city website includes searchable city maps, which provide information about a variety of topics like electric charging stations and bike parking facilities. He also mentioned an app built for reporting the need for garbage pickup across the city.
Finally, another interesting session was Turn Up the Fun (Gamifying Education) by Veethika Mishra, which provided examples of how including games in teaching can make the learning both more interesting for students and more accessible to those who may not otherwise engage with the subject. In terms of coding, games can help students become better at solving coding tasks without needing to be well-versed in the traditional computer science curriculum.
During my time at the conference, I learned about some of the ways in which open source technology can be used to provide meaningful information to people, and thought about ways in which the open-source idea can improve education. I hope to include these ideas in my future work.
The event was also focused on diversity and inclusion. The speakers were diverse and so was the large crowd. It was great to see so many people supporting the open source community and open source development, of which Caktus is a proud participant. Events like All Things Open are beneficial not only because they bring us together but also because of the important exchange of ideas. The next ATO is already scheduled for October 18 - 20, 2020 at the Raleigh Convention Center.
Based on a Twitter discussion about optimizing the performance of web apps, I implemented a very crude polyfill for an idea I recently had.

HTML forms support multiple target values, i.e. where the result of submitting the form should be displayed, but there is no target value for inline, that is, where you submit the form and, instead of the whole page refreshing, the server returns HTML that takes the place of the form contents.

The idea is that there would be a new target type for HTML forms: an attribute of target=_inline would mean that the form would be processed by the server and the contents of the form would be replaced with the HTML returned by the server. That is, on submit the form values would be sent to the server, either POST or GET, and the response should be HTML that the browser will simply .innerHTML onto the form that was submitted.
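The actual polyfill isn't reproduced here, but a very crude sketch of the idea might look something like this (it assumes fetch and URLSearchParams are available, and skips error handling):

document.addEventListener('submit', function (event) {
  var form = event.target
  if (form.getAttribute('target') !== '_inline') {
    return  // leave normal forms alone
  }
  event.preventDefault()
  var data = new FormData(form)
  var request
  if (form.method.toLowerCase() === 'post') {
    request = fetch(form.action, { method: 'POST', body: data })
  } else {
    request = fetch(form.action + '?' + new URLSearchParams(data).toString())
  }
  request
    .then(function (response) { return response.text() })
    .then(function (html) { form.innerHTML = html })  // replace the form contents inline
})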
On behalf of the Jython development team, I'm pleased to announce that the second beta release of Jython 2.7.2 is available! This is a bugfix release.
Please see the NEWS file for detailed release notes. This release of Jython requires JDK 8 or above.
This release is being hosted on Maven Central. There are four main distributions (including a new "slim" version). In order of popularity:
Most likely, you want the traditional installer. NOTE: the installer automatically installs pip and setuptools (unless you uncheck that option), but you must unset JYTHON_HOME if you have it set. See the installation instructions for using the installer.
To see all of the files available, including checksums, go to the Maven query for org.python+Jython and navigate to the appropriate distribution and version.
I recently completed a three-month internship with Caktus Group. This is a major accomplishment for me because two years ago, to the day, I was working as a consultant in the education services industry. I was inspired to pursue a career in web development after I came across a six-month course teaching full-stack JavaScript. In sharing my experience, I hope to shed light on what it’s like working as an intern at Caktus, and give professionals looking to transition into web development some advice that may be useful once they land an entry-level position. In general, my experience felt a little like this:
Day 0
Going into my internship with Caktus, I had three months of work experience as a software developer at BruVue, a start-up in the IoT space. My time with BruVue left me with a strong understanding of React.js and working with technologies such as Express.js, Mongoose ODM, and Knex Query Builder to build decoupled applications. I also had the opportunity to build an npm package called in-orbit, a React library of loading icons.
Day 1
The skills I gained at BruVue set a great foundation for my internship. I joined Team Discocaktus (Disco) as a front-end developer focusing on the Single-Page Application we were implementing using Vue.js and a Django REST Framework backend. Over the course of my internship the team got me acclimated to the project slowly in an Agile environment by first giving me bugs to fix, followed by UI tasks, and eventually full-stack tickets. The gradual onboarding process Team Disco employed gave me a better understanding of exactly what we were building, the technical requirements, and how our Django powered tech stack meets our clients’ needs. A common theme that developed throughout my internship was the need for me to learn through trial and error.
I was able to re-package the concepts I learned in React to build out stateful Vue components and transfer state throughout the app using the Vuex store; however, as tickets began taking on a back-end focus, I became increasingly dependent upon our Slack channels, ad hoc meetings, and the peer review process to bridge the gap in my understanding of Django and Django REST Framework. I’m quite thankful for the patience and time investment the Disco team has given me.
Day 90 & Insights
It’s amazing to reflect on all of the skills I have gained over the course of my internship. I now have a working understanding of Vue, Django, Django REST Framework and Python. In recent weeks, my role has expanded and I now help maintain projects for a variety of clients. I’m thrilled to say that I’ve started a full-time position with Caktus Group. Although internship experiences vary as broadly as the companies offering them, I believe the below advice can help anyone making the transition into tech.
When starting a new career as a software developer, I believe it is natural to feel unsure about the quality of the code you are writing, but understand that software development is a field where you need to write code in order to get feedback and improve your skillset. My mantra has been “code until something breaks, troubleshoot, and code until something else breaks.”
Be correctable! You are an intern. You are not expected to know everything. Ask questions and make sure to incorporate constructive peer feedback. You’ll become a better programmer and more adept at working within a collaborative environment.
No matter what tech stack you are using, become familiar with the debugging tools at your disposal. I was able to learn so much about Python and Django by applying a generous amount of Python Debugger’s set_trace() method, and print() statements throughout the code I was working on.
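For example, dropping a line like this into a view or test pauses execution right there so you can poke around (a generic illustration, not project code):

import pdb; pdb.set_trace()  # or simply breakpoint() on Python 3.7+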
Thanks for reading about my experience and I hope you found the above advice helpful. In the future, as my skills as a developer continue to take shape, I hope to bring you technical posts on topics ranging from Graphene-Django to Styled-Components to Python’s CSV Module. In the meantime, continue toward your goal of working within tech by building better projects, networking through meetups, and taking advantage of internship opportunities as they arise. With hard work and dedication these opportunities can translate into full-time roles.
Above: Django Fellow Carlton Gibson gives a talk on "Your Web Framework Needs You: An Update."
Again this year, DjangoCon more than delivered on its promise of something for everyone. A keynote on burnout and balance? Thoroughly entertaining! Jessica Rose’s talk left us in stitches. Who knew a talk about occupational burnout could be so fun? It was one of a wide range of subjects covered in three days of talks, bookended by a day of tutorials and two days of sprints. The conference took place in San Diego and ran from September 22 - 27. What a great week of learning and networking!
Caktus was honored to sponsor again this year, our tenth in a row. Attending this year from Caktus were Karen Tracey, lead developer and technical director; Jeremy Gibson, developer; Erin Mullaney, contract developer; and myself, Ian Huckabee, CBDO.
The Talks …
The sessions packed a strong lineup of speakers with a diversity of topics, and this year, every word was captured with closed captioning.
There were the practical: Luan Fonseca, for example, went to great lengths to ensure you’re never left with technical debt in this era of sustainable software. He addressed how to find it, how to deal with it, and how to keep it low through pragmatic code and architectural design patterns. 🎥 Watch now.
There was the unspeakable: Melanie Crutchfield invoked the horror of discovering you didn't write tests(!) and showed how having to retrofit untested Django code can make you a believer in the magic of testing. 🎥 Watch now.
There was the philosophical: Daniele Procida’s brilliant talk, a favorite among our Cakti, related nothingness and identity in Python and Django to utopia, politics, and theories of agency, leaving attendees with a new dimension to their understanding of the discipline. 🎥 Watch now.
And there was the inspirational: Erin Mullaney, whom we proudly claim as one of our own, a former Caktus staffer and now successful independent contractor, showed how easy it can be to roll your own tech job and start a business or side hustle from scratch, complete with tips on overcoming any fears of becoming your own boss. 🎥 Watch now.
Kudos to the organizers for a second year at this location. San Diegans complain that when the temperature is below 70 degrees it’s too cold, and when it’s in the 80s it’s too hot. By these standards, it was perfect all week. Which made the hotel’s well-appointed breezeway perfect for between-session breaks to code, Slack, or hold a quick meeting in the sun. And the walking trails provided opportunities for head-clearing, light exercise, and, in the case of some attendees, a chance to get their minds around nothingness and identity in Python and Django.
#DjangoCon is my new favorite conference! Learned a ton, had a blast, and made many new friends. Love this community! ⛵️🥰
This six-day international conference, “for the community by the community,” drew a diverse and welcoming crowd. Django Fellow Carlton Gibson gave a quick update on changes that make it even easier to contribute to Django and offered ideas on how members of the Django community can get involved, and event organizers offered free workshops on contributing to open source and contributing to Django. As Carlton proclaimed in the title of his talk, “Your web framework needs you!” 🎥 Watch now.
We held our second annual Caktus Mini Golf event on Wednesday evening at Tiki Town Adventure Golf, and this year we had 52 RSVPs! This mid-week break gave us a chance to enjoy the ocean breeze and make new friends. Thanks to everyone who came out, and congrats to our winner Chris Gillispie who received a $100 Amazon gift card for shooting the lowest score.
We had a blast at the Caktus Mini Golf event last night! ⛳ Thanks to everyone from @djangocon who came out and joined us. And congrats to our winner Chris Gillispie who received a $100 Amazon gift card for shooting the lowest score! #DjangoCon pic.twitter.com/eEHiNdjApo
We are pleased to continue serving the North Carolina community at-large through our semi-annual Charitable Giving Program. Twice a year we solicit proposals from our team to contribute to a variety of non-profit organizations. With this program, we look to support groups in which Cakti are involved or that have impacted their lives in some way. This gives Caktus a chance to support our own employees as well as the wider community. For the first half of 2019 we are pleased to support the following charities:
Alley Cats and Angels
Alley Cats and Angels is dedicated to improving the lives of stray, abandoned, and feral cats. They also work to reduce the number of homeless cats in the Triangle through adoption, farm cat, and spay/neuter assistance programs. Our Lead Developer and Technical Director Karen Tracey is a longtime volunteer for the organization. We are also lucky to have cats and kittens visit our office on a fairly regular basis.
The Museum of Life and Science
The Museum of Life and Science’s mission is to “create a place of lifelong learning where people, from young child to senior citizen, embrace science as a way of knowing about themselves, their community, and their world.” Our Chief Business Development Officer Ian Huckabee is a current museum board member and sits on the executive and finance committees.
Downtown Sailing Center
For a few years, Caktus CEO Tobias McNulty lived in Baltimore. During that time, he volunteered and sailed at the Downtown Community Sailing Center, a non-profit that has served the community since 1994. They provide sailing experiences for children and people with disabilities with the help of a large crew of volunteers. The organization “provides quality educational and life-enriching programs that promote self-esteem and teamwork through the joy of sailing. The Downtown Sailing Center is committed to promoting an environment of inclusiveness and accessibility, especially to youth, persons with disabilities, and those with limited opportunity.”
InStepp
InStepp is “[a] community change agent that transforms women and adolescent girls who have found themselves in less than desirable life circumstances. Our gender-responsive training, educational and prevention programs help women and girls rise above the odds to succeed in the world of work, and in all aspects of their lives. This, in turn, helps our communities thrive through empowered women that are caring, giving workers, family members and citizens.” Caktus’ Operations Director Kel Hanna nominated this organization for its great work in our community.
Project Jupyter
From the beginning, Caktus has been a supporter of and contributor to the open source community. As a team, we have really enjoyed using Jupyter Notebook on many of our projects. We have used it for exploratory coding and for documenting repeatable steps for projects. Project Jupyter is committed to being an open source software program, and we are proud to support that work and the open-source community that we love.
Caktus’ next round of giving will be Fall/Winter 2019, and we look forward to supporting another group of organizations that are committed to enriching the lives of North Carolinians!
Wagtail is a fantastic content management system that does a great job of making it easy for developers to get a new website up and running quickly and painlessly. It’s no wonder that Wagtail has grown to become the leading Django-based CMS. As one of the creators of Wagtail recently said, it makes the initial experience of getting a website set up and running very good. At Caktus, Wagtail is our go-to framework when we need a content management system.
Wagtail StreamFields are a particularly great idea: Rather than editing a big blob of Rich Text, you create a StreamField with the content blocks that you want — paragraphs, images, videos — and let editors create a stream of these blocks containing the content on the page. For example, with a video block, editors don’t have to fiddle with video embed codes, they just insert the video URL or shortcode, and the block handles all of the templating. This has the advantage of making it possible to have complex page content, without requiring the site editors to know how to model a complex design in HTML directly. It makes for a much better, more structured, and smoother editing experience. (StreamFields are such a great idea that WordPress has recently launched a similar feature — inspired by Wagtail?)
But…. There are some pain points for developers who work on large Wagtail projects. One of those is data migrations, particularly those that involve StreamFields. My informal survey of fellow developers yielded the following helpful comments:
“Wagtail migrations are evil.” —Eddie
“As a rule, avoid data migrations on streamfields at all costs.” —Neil
“It feels like we’re fighting the framework when we’re working programmatically with StreamField data, which is a real head-scratcher.” —name withheld
FWIW, I don’t think we’re exactly fighting the framework, so much as trying to do something that the framework hasn’t yet been optimized for. Wagtail has clearly been optimized to create a fantastic onboarding experience. And it’s really great. But it hasn’t yet been optimized for maintaining page data in an environment of shifting requirements. And so it’s currently really hard to do a data migration correctly.
The Caktus team was recently working on an existing Wagtail installation in which we were asked to migrate a page model from non-StreamFields to use a StreamField, giving editors greater flexibility and normalizing the page data. We were also asked, if possible, to migrate the existing pages’ data into the StreamField. That’s a pretty straightforward use case, and one that would seem to be a fairly common need: People start out their page models with regular ol’ fields, then they decide later (after building and publishing a bunch of pages!) that they want those pages to use StreamFields instead.
Considering all of this a worthy challenge, I rolled up my sleeves, dug in, and created a robust data migration for the project. It worked well, migrated all of the page and revision data successfully, and taught me a lot about Wagtail StreamFields.
At PyCon 2019, I hosted an open session on making Wagtail better for developers, and one of the things we talked about was data migrations (read more in my overview of PyCon 2019: “Be Quick or Eat Potatoes: A Newbie’s Guide to PyCon”). A couple of Wagtail core developers came to the session. I was pleased to learn that the method I used is essentially the same method that the Wagtail team has landed on as the best way to migrate StreamField data. So while this method isn’t yet officially supported in Wagtail, you heard it here first: This is currently the best way to do it.
How I migrated a Wagtail page model with a StreamField
Start with an Existing Page Model
To illustrate the method I used, I’ll set up a simple page model with
a title (as always)
a placed image
a body
a list of documents (that can be displayed as a grid, for example)
The code for this page model looks like this (omitting all the scaffolding of imports etc.):
# First version of the model.
class ExamplePage(Page):
    image = models.ForeignKey(
        'wagtailimages.Image',
        null=True,
        blank=True,
        on_delete=models.SET_NULL,
        related_name='+',
    )
    body = RichTextField()
    docs = StreamField([
        ('doc', DocumentChooserBlock()),
    ])

    content_panels = Page.content_panels + [
        ImageChooserPanel('image'),
        FieldPanel('body'),
        StreamFieldPanel('docs'),
    ]
Example Page: Starting Point
Situation 1: Add New Fields to the Model without Moving or Renaming Anything
Now let’s suppose the customer wants to add pages to the docs block — they want to be able to display a link to a page in the grid alongside downloadable documents.
Here’s what the model looks like after adding a 'page' block to the 'docs' StreamField:
# Second version of the model: Added page block to the docs StreamField
class ExamplePage(Page):
    image = models.ForeignKey(
        'wagtailimages.Image',
        null=True,
        blank=True,
        on_delete=models.SET_NULL,
        related_name='+',
    )
    body = RichTextField()
    docs = StreamField([
        ('doc', DocumentChooserBlock()),
        ('page', PageChooserBlock()),
    ])

    content_panels = Page.content_panels + [
        ImageChooserPanel('image'),
        FieldPanel('body'),
        StreamFieldPanel('docs'),
    ]
You can create and run this migration, no problem and no worries, because you haven’t moved or changed any existing data.
Rule 1: You can add fields to the model, and new blocks to StreamFields, with impunity — as long as you don’t move or rename anything.
Situation 2: Create Data Migrations to Move Existing Data
Some time later, the customer / site owner / editors have written and published a hundred pages using this model. Then the fateful day arrives: The customer / site owner / editors have enjoyed working with the docs field, and now want to move all the page content into a StreamField so that they can have a lot more flexibility about how they structure the content.
Does this sound familiar?
It’s not hard to write the new model definition.
# End result: The model after content has been migrated to a StreamField:
class ExamplePage(Page):
    content = StreamField([
        ('image', ImageChooserBlock()),
        ('text', RichTextBlock()),
        ('docs', StreamBlock([
            ('doc', DocumentChooserBlock()),
            ('page', PageChooserBlock()),
        ])),
    ])

    content_panels = Page.content_panels + [
        StreamFieldPanel('content'),
    ]
Now, it goes almost without saying: Do not create and run this migration. If you do, you will have a VERY angry customer, because you will have deleted all of their content data.
Instead, you need to break up your migration into several steps.
Rule 2: Split the migration into several steps and verify each before doing the next.
You’ll notice that I chose a different name for the new field — I didn’t, for example, name it “body,” which currently exists as a RichTextField. You want to avoid renaming fields, and you want to do things in an orderly way.
So, here are the steps of a Wagtail data migration.
Step 1: Add fields to the model without moving or renaming anything.
Here’s the non-destructive next version of the model.
# Data Migration Step 1: The model with the `content` StreamField added.
class ExamplePage(Page):
    # new content StreamField
    content = StreamField([
        ('image', ImageChooserBlock()),
        ('text', RichTextBlock()),
        ('docs', StreamBlock([
            ('doc', DocumentChooserBlock()),
            ('page', PageChooserBlock()),
        ])),
    ], null=True, blank=True)

    # old fields retained for now
    image = models.ForeignKey(
        'wagtailimages.Image',
        null=True,
        blank=True,
        on_delete=models.SET_NULL,
        related_name='+',
    )
    body = RichTextField()
    docs = StreamField([
        ('doc', DocumentChooserBlock()),
        ('page', PageChooserBlock()),
    ])

    content_panels = Page.content_panels + [
        StreamFieldPanel('content'),
        # old panels retained for now
        ImageChooserPanel('image'),
        FieldPanel('body'),
        StreamFieldPanel('docs'),
    ]
The content field has to allow null values (null=True), because it’s going to be empty for all existing pages and revisions until we migrate the data.
Step 2: Create a data migration that maps / copies all the data from the old fields to the new fields, without modifying the existing fields. (Treat existing data as immutable at this point.)
This is the hard part, the fateful day, the prospect of which makes Wagtail devs run away screaming.
I’m here to encourage you: You can do it. Although this procedure is not well-documented or supported by Wagtail, it works reliably and well.
So, let’s do this. First you’ll create an empty migration
You’ll end up with an empty migration. For the “forward” migration, you’ll add a RunPython operation that copies all the content data from the existing fields to the new StreamField.
You can also create a “reverse” operation that undoes the changes, but I usually prevent reverse migrations — life is hard enough as it is. However, it’s up to you, and the same kind of procedure can work in reverse.
Here’s what things will look like so far:
def copy_page_data_to_content_streamfield(apps, schema_editor):
    raise NotImplementedError("TODO")


def prevent_reverse_migration(apps, schema_editor):
    raise NotImplementedError(
        "This migration cannot be reversed without"
        + " inordinate expenditure of time. You can"
        + " `--fake` it if you know what you're doing,"
        + " and are a migration ninja.")


class Migration(migrations.Migration):

    dependencies = [
        ('home', '0005_add_content_streamfield'),
    ]

    operations = [
        migrations.RunPython(
            copy_page_data_to_content_streamfield,
            prevent_reverse_migration,
        )
    ]
The copy_page_data_to_content_streamfield(…) function will copy all page and revision data from the existing fields to the new content StreamField. Here’s what it looks like:
def copy_page_data_to_content_streamfield(apps, schema_editor):
    """With the given page, copy the page data to the content stream_data"""
    # if the ExamplePage model no longer exists, return directly
    try:
        ExamplePage = import_module('home.models').ExamplePage
    except:
        return
    for page in ExamplePage.objects.all():
        page_data = json.loads(page.to_json())
        content_data = page_data_to_content_streamfield_data(page_data)
        if content_data != page.content.stream_data:
            page.content.stream_data = content_data
            page.save()
        for revision in page.revisions.all():
            revision_data = json.loads(revision.content_json)
            content_data = page_data_to_content_streamfield_data(revision_data)
            if content_data != revision_data.get('content'):
                # StreamField data is stored in revision.content_json in a string field
                revision_data['content'] = json.dumps(content_data)
                revision.content_json = json.dumps(revision_data, cls=DjangoJSONEncoder)
                revision.save()
There are several things to notice here:
We’re importing the ExamplePage definition from home.models rather than via apps.get_model(). This allows us to use the ExamplePage.to_json() method. We have to import the model during the migration using importlib so that future model changes don’t break the migration. (Never import from the app’s models at the module-level of a migration.) We also need to put the import into a try/except block, in case the model is deleted in the future.
Using page.to_json() puts the page_data into the same form as the page revision data, which makes it much easier to do a data migration (one function for both page data and revision data).
We’re using regular Python data structures – dicts, lists, etc. This turns out to be a lot easier than trying to build StreamValues directly.
We’re using the same helper function, page_data_to_content_streamfield_data(…) (which we haven’t yet created) for both the page data and all revisions data. (We’ll develop this function next.) We can use the same helper function for page data and revisions data because the data structures are the same when represented using Python data structures.
The content data in revisions is stored in a JSON string. No problem. We just use json.loads() and json.dumps() with the DjangoJSONEncoder (DjangoJSONEncoder is not entirely necessary here because we don’t have any date or datetime fields in this model, but it’s a good practice to use it in Django projects).
Next, we need to implement the page_data_to_content_streamfield_data() function. This function takes a Python dict as its only argument, representing either the page data or a revision’s data, and returns a Python list, representing the data to be placed in the new content StreamField. It’s a pure function, with no side-effects, and that means it doesn’t mutate the page or revision data (which is only a copy anyway).
To build this function, it’s helpful to start with the definition of the content StreamField, and use it to build a Python data structure that contains the existing data. Here is the content StreamField definition again:
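content = StreamField([
    ('image', ImageChooserBlock()),
    ('text', RichTextBlock()),
    ('docs', StreamBlock([
        ('doc', DocumentChooserBlock()),
        ('page', PageChooserBlock()),
    ])),
])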
StreamField definitions use a list of tuples, but the stream_data that we’re building uses a list of dicts, which will look like this:
def page_data_to_content_streamfield_data(page_data):
    """With the given page field data, build and return content stream_data:
    * Copy existing page data into new stream_data.
    * Handle either the main page data or any revision data.
    * page_data is unchanged! (treated as immutable).
    """
    content_data = [
        {'type': 'image', 'value': ...},
        {'type': 'text', 'value': ...},
        {'type': 'docs', 'value': [...]},
    ]
    return content_data
We need to fill in the values for each 'value' field. The 'image' and 'text' are easy: We just need to copy in the 'image' and the 'body' values from the page_data.
The 'docs' value is going to be little harder — but not much! We have to do is take the stream_data from the existing ‘docs’ field. Since ‘docs’ is a StreamField, it is stored as a string in the page_data that comes from json. When loaded, here’s what that field looks like — a typical stream_data value:
We’re simply going to load the json value and copy over the data, filtering out the ‘id’ fields (Wagtail will assign new ones for us).
Here’s what the final version of the function looks like:
def page_data_to_content_streamfield_data(page_data):
    """With the given page field data, build and return content stream_data:
    * Copy existing page data into new stream_data.
    * Handle either the main page data or any revision data.
    * page_data is unchanged! (treated as immutable).
    """
    return [
        {'type': 'image', 'value': page_data['image']},
        {'type': 'text', 'value': page_data['body']},
        {'type': 'docs', 'value': [
            {key: block_data[key] for key in ['type', 'value']}  # no 'id'
            for block_data in json.loads(page_data['docs'])
        ]},
    ]
That’s it! We’re just mapping the existing data to the new data structure, without changing any of the existing data. It reminds me a little bit of using XSLT to declare a transformation from one data schema to another.
Now we can run the data migration! When we do so, we see all of the existing page data populating the page content field.
Example Page with Added Fields and Data Migration Applied (Steps 1 & 2)
Step 3: Deploy the migration and let editors review everything, making sure that all the data was correctly copied.
Step 4: Switch the site templates / API to the new fields. By making this a separate step before deleting the old data, we make sure that we haven’t missed anything before we pass the point of no return. (As our CEO and Co-founder, Tobias McNulty, pointed out while reviewing this post: “Extra reviews never hurt — plus, you'll have no way to revert if the new templates introduce some non-trivial breaking changes (and you've already deleted your model fields).”)
It’s a good idea not to delete any production data until the customer / site owner / editors are satisfied. So we deploy the site at this point and wait for them to be satisfied that the old data has migrated to the new fields, and that the site templates / API are correctly using the new fields.
Step 5: Create final migration that deletes the old data, and deploy it with updated templates that use the new fields. This is the point of no return.
Now your model can finally look like the “end result” above.
# End result: The model after content has been migrated to a StreamField:
class ExamplePage(Page):
    content = StreamField([
        ('image', ImageChooserBlock()),
        ('text', RichTextBlock()),
        ('docs', StreamBlock([
            ('doc', DocumentChooserBlock()),
            ('page', PageChooserBlock()),
        ])),
    ])

    content_panels = Page.content_panels + [
        StreamFieldPanel('content'),
    ]
Creating this final migration is super easy: Just delete all the old fields and content_panels from the model, let Django create the migration, and apply it. We’ve also removed blank=True, null=True from the content field definition, because now that the data migration has been applied, every instance of the content field should be non-null. (Django’s makemigrations will ask you what to do about existing null rows. There shouldn’t be any, so you can choose “2) Ignore for now…” when makemigrations prompts you. Or you can just leave the null=True parameter on the content field.)
Example Page in Final Form, with Old Fields Removed (Step 5)
Summary: Rules for Wagtail Data Migrations
You can add fields to the model, and new blocks to StreamFields, with impunity — as long as you don’t move or rename anything.
If you are moving or renaming data, split the migration into several steps
Step 1: add new fields that will contain the post-migration data
Step 2: create a data migration that maps / copies all the data from the old fields to the new fields. Do this without modifying the existing fields. (Treat existing data as immutable at this point.)
Step 3: you might want to pause here and let the editors review everything before changing the templates to use the new fields.
Step 4: switch the site templates / API to the new fields.
Step 5: once the editors are happy, you can create a migration that deletes the old fields from the model.
Data migrations involving StreamFields are best done by writing directly to the stream_data property of the StreamField. This method:
allows the use of a json-able dict (Python-native data structure), which is a lot easier than trying to build the StreamValue using Wagtail data structures.
allows using the same function for both the page data and the page revisions data, keeping things sane.
is not officially supported by Wagtail, but can be said to be sanctioned by at least a couple of Wagtail core developers.
The repository containing the worked example is available on GitHub.
Migrate with Confidence
There’s no question that migrating page data involving Wagtail StreamFields is an involved process, but it doesn’t have to be scary. By doing things in distinct stages and following the methods outlined here, you can migrate your data with security and confidence.
These days it’s easy to get swept up into the buzz around Python’s
strengths as a data science package, but Python is also great for the
more mundane, business process side of computing. One of the most
important business processes is generating reports, and the most used
and venerable form of report is the PDF. Python has a great library for
generating and manipulating PDFs:
ReportLab. I recently read
more about this extremely useful library in ReportLab: PDF Processing
with Python, by Michael Driscoll. With a few caveats, it’s an excellent
resource.
Python remains a great choice for the stuff that no one ever got rich on
Patreon writing or talking about. Things
like processing spreadsheets (which pandas is great at, by the way),
mail-merge and of course, arguably one of the most important business
activities, generating PDF reports. For this, Mike Driscoll’s book is a
great introduction, tutorial, and resource for any Python programmer
looking to get into the exciting world of programmatically generated
Quarterly TPS reports!
The Technical
This book is available in digital format (PDF natch), and can be found
on the author’s
website.
There is a lot of content in this book. It contains 428 pages of
examples and deep dives into the API of the library. Seriously, if there
is something you wish you could do with a PDF and ReportLab can do it,
then this book will get you started.
The Good
Because the bitter is often softened by the sweet, I’ll start with the
sweet things about this book.
It is clear that the author, Michael Driscoll, knows ReportLab very
well, and he knows how to construct illustrative snippets of code that
demonstrate his material. From start to finish this book is full of
clear, useful code that works (this cannot be underlined enough): the
code in the book will work if you copy it, which is sadly a rarity for
many resources about computing. Big publishing names like O’Reilly and
Wrox, who have editorial staff, often publish books with
broken examples. Full disclosure, I did not run every single piece of
code, but I did sample about 40% of the code and none of it was broken.
Driscoll also does a very good job of building up his examples. Every
book on programming starts with its “Hello, World!” example, and this
book is no exception, but in my experience, the poorer books out there
fail to continue a steady progression of ideas that layer logically one
on top of the other, which can leave a reader feeling lost and
frustrated. Driscoll, on the other hand, does a very good job of
steadily incrementing the work already done with the new examples.
Almost every example in this book shows its result as an embedded image.
This, of course, makes sense for a book about a library that works with
PDFs. It is also another one of those touches that highlight the
accuracy of the code. It’s one thing to say, “Hey, cool, the code I just
worked through ran,” and another to be able to compare your results
visually with the source.
The Not So Good
I have one major complaint about this book and a few minor editorial
quibbles.
Who is the intended audience for this book?
While the parts of the book that actually deal with ReportLab are
extremely well organized, the opening of the book is a mess of
instructions that might turn off novice programmers, and are a little
muddled for experienced developers.
The first section, “Conventions,” discusses the Python prompt, which
indicates a focus on beginners, but then the very next section jumps
right into setting up a virtual environment. Wait, I’m a beginner, what
is the “interpreter”? What is IDLE? What is going on here? On the flip
side, if this book was targeted at more experienced developers, much of
this could be boiled down into a single dependencies and style section.
The author also adds a section about using virtualenv and dependencies,
but the discussion of virtualenvs takes place before a discussion about
Python. For the beginner, this could possibly stop them altogether as
they try to install virtualenv on a machine that doesn’t already have
Python installed.
To be fair, none of this is a problem for an experienced developer, and
with a specialized topic like working with a fairly extensive and
powerful library like ReportLab, the author can be forgiven for assuming
a more experienced readership. However, this should be spelled out at
the beginning of the book. Who is the book for? What skill level is
needed to get the most from the book?
Quibble: Code Styling Is Inconsistent
This is certainly a minor quibble — the code working is much more
important — but quite often I would see weird switches in style from
example to example and sometimes within examples.
First off, ReportLab itself uses lowerCamelCase for class methods
and functions rather than snake_case, which sometimes bleeds over
into the author’s choice of variable names. For example, on page 57, the
author is showing us how to use ReportLab to build form letters, and his
example mixes lowerCamelCase and snake_case variable names within the same block of code.
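The book's snippet isn't reproduced here, but the mixture looks something like this (names invented purely for illustration):

import datetime

issue_date = datetime.date(2019, 12, 1)  # snake_case, typical Python style
recipientName = "Jane Doe"               # lowerCamelCase, echoing ReportLab's own API style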
Is this minor? Yes. Does it make my hand itch? Yes.
Quibble: Stick with a single way of doing things.
Sometimes the author switches between a Python 2 idiom and a Python 3
idiom for doing a thing. In the same code example I noted in the above
quibble, the author uses the Python 2 % operator to do string
interpolation, and in the same block of code throws in a Python 3
.format() for the exact same purpose. I noticed this only a couple
of times so again — minor. But these sorts of things can throw a new
developer who is trying to grasp the material and perhaps a new
language.
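For reference, the two idioms being mixed look like this (a generic illustration, not the book's code):

name = "reader"
old_style = "Hello, %s!" % name        # %-style interpolation
new_style = "Hello, {}!".format(name)  # str.format() interpolation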
Conclusion
If you are interested in learning how to automate the generation of PDFs
for your projects and you plan on using ReportLab, then this book is a
great choice. It covers in detail every aspect of the ReportLab library
in a clear and iteratively more complex manner. Also, the code examples
work!
Aside from a slightly unfocused introduction, which could hinder a new
developer from approaching the material, and some style inconsistencies,
the author has produced a solid instructional book. It’s a great
reference when you need to brush up on how to accomplish some arcane bit
of PDF magic.
Note: This review was solicited by the author of the book, and my
company received a free copy for review. However, all opinions are my
own.
I caught up with Tobias and Vinod recently to ask a few questions about their interest in Elixir and Phoenix, which is the premier framework for web development with Elixir.
How did you learn about Elixir and Phoenix? What most excites you about each?
Vinod: I'm intrigued by functional programming languages. Like many Cakti, I don't have formal computer science training, but early in my career, I listened to some lectures from Berkeley that were presented in the functional language Scheme, and ever since then I've been interested in functional languages. Elixir has come across my radar many times, but I didn't really get interested until our former colleague Neil started talking about it, which eventually led to a recent internal project. That project was really fun, and it showed me how quickly you can go from zero to productive in Elixir and Phoenix. Since then, I've read and tinkered more. The more I look at it, the more interested I am.
I guess I'm most interested in being able to apply some of the benefits of immutability and simplicity that functional languages provide. Specifically, I'd like to learn more about how LiveView can allow us to build dynamic UIs using Elixir rather than JavaScript.
Tobias: I first learned about Elixir during a ShipIt Day at Caktus in 2016. Since that time, a few potential clients have reached out to us about Elixir projects and we've continued to learn and exercise our Elixir skills. We recently did a fun project called Chippy, which uses Elixir, Phoenix, and LiveView. Chippy is a digital implementation of the traditional physical "chips" used by a development team to determine project allocations during sprint planning.
Coming from a Python/Django background, what excites me the most about Elixir is its entirely different approach to processes and concurrency. A single Elixir app might have hundreds, if not thousands, of concurrent processes (without suffering from something like the Global Interpreter Lock in Python), and has been proven to support massive numbers of concurrent network connections. These features may not be relevant to every project, but for some they make a lot of sense.
Caktus is traditionally a Python/Django shop. Why branch out?
Vinod: Caktus has been well served by Python and I don't expect that to stop anytime soon. But Caktus has always felt like the kind of place where I'm encouraged to try new things and see what we can learn from them. There is a lot to learn from something as different as Elixir.
Tobias: I love Python and Django, and it's still our web framework of choice for nearly all client projects. Django is a stable, battle-tested solution and its batteries-included philosophy makes it quick and easy to create web applications and backend APIs. That said, projects that demand a high level of concurrency or a large number of network connections might benefit from a language like Elixir. We're even exploring the possibility of using Elixir and Python side-by-side for a single project, since both Django and Phoenix have an affinity for the Postgres database.
What are you most looking forward to at the conference?
Vinod: I'm looking forward to being immersed in Elixir while being surrounded by Elixir enthusiasts. I'm hoping to get more familiar with the language, while finding inspiration on how to apply it to our own technical problems.
Tobias: Being relatively new to the community, I'm most looking forward to meeting people at the conference and learning how they use Elixir (especially for web projects).
👋 We hope to see you there! Tobias and Vinod will sport their Caktus shirts during the conference, and you can also connect with them on Twitter via @TobiasMcNulty and @vkurup.
We’re looking forward to the international gathering at DjangoCon 2019, in San Diego, CA. The six-day conference, from September 22 - 27, is focused on the Django web framework, and we’re proud to attend as sponsors for the tenth year! We’re also hosting the second annual Caktus Mini Golf event.
⛳ If you’re attending DjangoCon, come play a round of mini golf with us. Look for our insert in your conference tote bag. It includes a free pass to Tiki Town Adventure Golf on Wednesday, September 25, at 7:00 p.m. (please RSVP online). The first round of golf is on us! And whoever shoots the lowest score will win a $100 Amazon gift card.*
Talk(s) of the Town
Among this year’s talented speakers is one of our own, Erin Mullaney (pictured). Erin has been with Caktus since 2015, and has worked as a contractor for us since July 2017. On Monday, September 23, she’ll share her experiences going from a full-time developer to a contractor in her talk, “Roll Your Own Tech Job: Starting a Business or Side Hustle from Scratch.” The talk will cover her first two years as a consultant, including how she legally set up her business and found clients. Erin said she enjoys being her own boss and is excited to share her experiences.
Caktus Developer Jeremy Gibson, who will attend DjangoCon for the first time, is looking forward to expanding his knowledge of Django best practices surrounding queries and data modeling. He’s also curious to see what other developers are doing with the framework. He’s most looking forward to the sessions about datastore and Django's ORM, including:
If you’d like to meet the Caktus team during DjangoCon, join us for our second annual Mini Golf Event. Or you can schedule a specific time to chat with us one-on-one.
During the event, you can also follow us on Twitter @CaktusGroup and #DjangoCon2019 to stay tuned in. Check out DjangoCon’s Slack channel for attendees, where you can introduce yourself, network, and even coordinate to ride share.
We hope to see you there!
*In the event of a tie, the winner will be selected from a random drawing from the names of those with the lowest score. Caktus employees can play, but are not eligible for prizes.
Pandas is a powerful Python data analysis tool. It's used heavily in the data science community since its data structures make real-world data analysis significantly easier. At Caktus, in addition to using it for data exploration, we also incorporate it into Extract, Transform, and Load (ETL) processes.
The Southern Coalition for Social Justice’s Open Data Policing website uses Pandas in various capacities. Open Data Policing aggregates, visualizes, and publishes public records related to all known traffic stops in North Carolina, Maryland, and Illinois. The project must parse and clean data provided by state agencies, including the State of Maryland. Maryland provides data in Excel files, which can sometimes be difficult to parse. pandas.read_excel() is also quite slow compared to its _csv() counterparts.
By default, pandas.read_excel() reads the first sheet in an Excel workbook. However, Maryland's data is typically spread over multiple sheets. Luckily, it's fairly easy to extend this functionality to support a large number of sheets:
import pandas as pd

def read_excel_sheets(xls_path):
    """Read all sheets of an Excel workbook and return a single DataFrame"""
    print(f'Loading {xls_path} into pandas')
    xl = pd.ExcelFile(xls_path)
    df = pd.DataFrame()
    columns = None
    for idx, name in enumerate(xl.sheet_names):
        print(f'Reading sheet #{idx}: {name}')
        sheet = xl.parse(name)
        if idx == 0:
            # Save column names from the first sheet to match for append
            columns = sheet.columns
        sheet.columns = columns
        # Assume index of existing data frame when appended
        df = df.append(sheet, ignore_index=True)
    return df
The Maryland data is in the same format across all sheets, so we just stack the sheets together in a single data frame. Now we can load the entire Excel workbook:
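As a sketch of that step (the filename here is a placeholder, not the actual Maryland file):

# Combine every sheet in the workbook into one DataFrame,
# then save it as CSV so later loads are fast
stops = read_excel_sheets("MD_traffic_stops.xlsx")
stops.to_csv("md_stops.csv", index=False)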
We can add to this data, too. Maryland sends deltas of data, rather than an updated full data set. So it can be appended to the existing CSV data set by using mode="a":
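Roughly like this, again with a placeholder filename for the delta workbook:

# Append the new delta to the existing CSV; header=False keeps the column
# names from being repeated in the middle of the file
delta = read_excel_sheets("MD_traffic_stops_delta.xlsx")
delta.to_csv("md_stops.csv", mode="a", header=False, index=False)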
Now we have a single CSV file with all of the data.
That's it! This certainly isn't a huge data set by any means, but since working with Excel files can be slow and sometimes painful, we hope this short how-to will be helpful!
It’s no secret that Caktus ❤️’s Wagtail, our favorite Django-based CMS. On July 25, I had the pleasure of attending Wagtail Space US, an annual convening and celebration of all things Wagtail at The Wharton School at the University of Pennsylvania. After a couple days of talks, workshops, and sprints, we’re even more excited by what Wagtail can offer us and our clients.
Tom Dyson, Technical Director at Torchbox, kicked off the first day with a “State of Wagtail” overview (pictured right; photo by Will Barton). He highlighted Wagtail’s increasing traction as the most popular CMS on the fastest growing software language (Python). From Google to NASA to the UK Government, big names are increasingly investing in Wagtail applications. At the same time, the Wagtail core contributor team is growing as more developers around the world invest their time and expertise to improve the core application. In short, it’s a good time to be working with Wagtail. Watch the full presentation.
On the topic of growth, several speakers discussed new extensions and distribution models for Wagtail. For example, Brian Smith and Eric Sherman from Austin’s Office of Design and Delivery demonstrated their new guided page creation model, which provides content creators with a templatized page creation wizard. This allows organizations running Wagtail to distribute content creation while maintaining a consistent voice and style guide.
Additionally, Vince Salvino from web development firm CodeRed gave a practical demonstration of CodeRed CMS, a new distribution tool that aims to streamline the deployment and delivery of Wagtail-based sites. CodeRed has open sourced this package, making it available for any team seeking to rapidly deploy and customize a Wagtail marketing site.
Two back-to-back talks focused on accessibility, including Torchbox’s Thibaud Colas (pictured; photo by Will Barton) discussing improvements to the tab stop and voiceover features of the Wagtail admin interface. See his presentation. Plus, Columbia’s Zarina Mustapha covered her team’s implementation of voiceover accessibility using custom Wagtail fields. Watch her presentation. The second day sprints also encouraged a focus on accessibility, and it’s great to see the Wagtail community embrace accessibility as a core requirement.
I particularly enjoyed Naomi Morduch Toubman’s talk on “Thoughtful Code Review.” Naomi is a Wagtail core contributor, and spoke about how developers can be positive and productive collaborators via code review on not just Wagtail but any shared project. Her message aligns closely with Caktus’s philosophy towards team-based software development, and could be considered required watching for any developer seeking to be a stronger team member.
Finally, Tim Allen — IT Director at The Wharton School and a driving force in the Wagtail community — gave an impassioned talk that explored both personal experience and technology. While the technology focus was on an organization’s adoption of Wagtail to improve communication and user experience, Tim also spoke eloquently about the power of proactive community and the sometimes lifesaving importance of inclusivity.
Caktus will be back at Wagtail Space in 2020, and we look forward to seeing the event’s continued growth and success!
I have been blogging bits and pieces over the years, but Jon’s query has given
me a good excuse to roll all of that up into a single document.
For the last five years my team and I have been using web components to build
our web UIs. At the time I wrote the Zero Framework Manifesto we moved all of our
development over to Polymer.
Why Polymer?
We started with Polymer 0.5 as it was the closest thing to web components that
was available. At the time I wrote the Zero Framework Manifesto, all of the
specifications that made up web components were still just proposed standards
and only Chrome had implemented any of them natively. We closely followed
Polymer, migrating all of our apps to Polymer 0.8 and finally to Polymer 1.0
when it was released. This gave us a good taste for what building web
components was like and verified that building HTML elements was a productive
way to do web development.
How
One of the questions that comes up regularly when talking about zero frameworks
is how can you expect to stitch together an application without a framework? The
short answer is ‘the same way you stitch together native elements’, but I think
it’s interesting and instructional to look at those ways of stitching elements
together individually.
There are six surfaces, or points of contact, between elements that you can
use when stitching elements together, whether they are native or custom
elements.
Before we go further a couple notes on terminology and scope. For
scope, realize that we are only talking about DOM, we aren’t talking about
composing JS modules or strategies for composing CSS. For the terminology
clarification, when talking about DOM I’m referring to the DOM
Interface for an element, not the element markup. Note that there is a
subtle difference between the markup element and the DOM Interface to such
an element.
For example, <img data-foo="5" src="https://example.com/image.png"/> may be
the markup for an image. The corresponding DOM Interface has an attribute of
src with a value of https://example.com/image.png but the corresponding DOM
Interface doesn’t have a data-foo attribute, instead all data-* attributes
are available via the dataset attribute on the DOM Interface. In the
terminology of the WhatWG Living Standard,
this is the distinction between content attributes vs IDL
attributes, and I’ll only be referring to IDL attributes.
With the preliminaries out of the way let’s get into the six surfaces
that can be used to stitch together an application.
Attributes and Methods
The first two surfaces, and probably the most obvious, are attributes and
methods. If you are interacting with an element it’s usually either reading and
writing attribute values:
element.children
or calling element methods:
document.querySelector('#foo');
Technically these are the same thing, as they are both just properties with
different types. Native elements have their set of defined attributes and
methods, and depending on which element a custom element is derived from it
will also have that base element’s attributes and methods along with the
custom ones it defines.
Events
The next two surfaces are events. Events are actually two surfaces because an
element can listen for events, and an element can also dispatch events of its
own for others to listen to:
var e = new CustomEvent('some-event', {detail: details});
this.dispatchEvent(e);
DOM Position
The final two surfaces are position in the DOM tree, and again I’m
counting this as two surfaces because each element has a parent and can be
a parent to another element. Yeah, an element has siblings too, but that
would bring the total count of surfaces to seven and ruin my nice round
even six.
<button><img src=""></button>
Combinations are powerful
Let’s look at a relatively simple but powerful example, the ‘sort-stuff’
element. This is a custom element that allows the user to sort elements. All
children of ‘sort-stuff’ with an attribute of ‘data-key’ are used for sorting
the children of the element pointed to by the sort-stuff’s ‘target’ attribute.
See below for an example usage:
<sort-stuff target='#sortable'>
  <button data-key=one>Sort on One</button>
  <button data-key=two>Sort on Two</button>
</sort-stuff>
<ul id=sortable>
  <li data-one=c data-two=x>Item 3</li>
  <li data-one=a data-two=z>Item 1</li>
  <li data-one=d data-two=w>Item 4</li>
  <li data-one=b data-two=y>Item 2</li>
  <li data-one=e data-two=v>Item 5</li>
</ul>
If the user presses the “Sort on One” button then the children of #sortable
are sorted in alphabetical order of their data-one attributes. If the user
presses the “Sort on Two” button then the children of #sortable are sorted in
alphabetical order of their data-two attributes.
Here is the definition of the ‘sort-stuff’ element:
And here is a running example of the code above:
Note the surfaces that were used in constructing this functionality:
sort-stuff has an attribute 'target' that selects the element to sort.
The target children have data attributes that elements are sorted on.
sort-stuff registers for 'click' events from its children.
sort-stuff children have data attributes that determine how the target children will be sorted.
In addition you could imagine adding a custom event ‘sorted’ that
‘sort-stuff’ could generate each time it sorts.
Why not Polymer?
But after having used Polymer for so many years, we looked at the direction of
Polymer 2.0 and now 3.0 and decided that it may not be the direction we want to
take.
There are a few reasons we moved away from Polymer. Polymer started out and
continues to be a platform for experimentation with proposed standards, which
is great, as they are able to give concrete feedback to standards committees
and allow people to see how those proposed standards could be used in
development. The downside to the approach of adopting nascent standards is
that sometimes those things don’t become standards. For example, HTML Imports
was a part of Polymer 1.0 that had a major impact on how you wrote your
elements, and when HTML Imports failed to become a
standard, you
had a choice of either a major migration to ES modules or carrying around a
polyfill for HTML Imports for the remainder of that web app’s life. You can
see the same thing happening today with Polymer 3.0 and CSS
mixins.
There are also implementation decisions I don’t completely agree with in
Polymer, for example, the default use of Shadow
DOM.
Shadow DOM allows for the encapsulation of the children of a custom element so
they don’t participate in things like querySelector() and normal CSS
styling. But there are several problems with that. The first is that when
using Shadow DOM you lose the ability to make global styling changes. If you
suddenly decide to add a “dark mode” to your app you will need to go and
modify each element’s CSS. It was also supposed to be faster, but since each
element contains a copy of the CSS there are performance
implications, though
there is work underway to address
that. Shadow DOM seems like
a solution searching for a problem, and Polymer defaults to using Shadow DOM
while offering a way to opt out and use Light DOM for your elements; I believe
the default should lie in the other direction.
Finally, Polymer’s data binding has some mis-features. It offers two-way data
binding, which is never a good idea; every instance of two-way data binding is
just a bug waiting to happen. The data binding also has a lot of magic to it:
in theory you just update your model and Polymer will re-render your template
at some point in the future with the updated values. The “at some point in the
future” is because updates happen in an async fashion, which in theory allows
the updates to be more efficient by batching them. The reality is that you
spend a lot of development time updating your model, not getting updated DOM,
and scratching your head until you either remember to call the function that
forces a synchronous render, or realize that you updated a deep part of your
model that Polymer can’t observe, so you need to update your code to use the
set() method, where you give the path to the part of the model you just
updated. The async rendering and observing of data is fine for simple
applications, but for more complex applications it leads to wasted developer
time debugging situations where a simpler data binding model would suffice.
It is interesting to note that the Polymer team also produces the
lit-html library, which is simply a
library for templating that uses template literals and HTML Templates to make
the rendering more efficient, and it has none of the issues I just pointed
out in Polymer.
What comes after Polymer?
This is where I started with a very concrete and data-driven minimalist
approach, first determining what base elements we really needed, then what
library features we would need as we built up those elements, and finally what
features we would need as we built full-fledged apps from those base elements. I was
completely open to the idea that maybe I was just being naive about the need
for async render or Shadow DOM and I’d let the process of building real world
applications inform what features were really needed.
The first step was to determine which base elements we really needed. The
library of iron-* and paper-* elements that Polymer provides is large and the
idea of writing our own version of each was formidable, so instead I looked
back over the previous years of code we’d written in Polymer to determine
which elements we really did need. If we’d started this process today I would
probably just have gone with Elix or another
pure web components library of elements, but none of them existed at the time
we started this process.
The first thing I did was scan each project and record every Polymer element
used in every project. If I’m going to replace Polymer at least I should know
how many elements I’m signing up to rewrite. That initial list was surprising
in a couple of ways. The first was how short the list was:
Polymer/Iron elements Used
iron-ajax
iron-autogrow-textarea
iron-collapse
iron-flex-layout
iron-icon
iron-pages
iron-resizable-behavior
iron-scroll-threshold
iron-selector
paper-autocomplete
paper-button
paper-checkbox
paper-dialog
paper-dialog-scrollable
paper-drawer-panel
paper-dropdown-menu
paper-fab
paper-header-panel
paper-icon-button
paper-input
paper-item
paper-listbox
paper-menu
paper-menu-button
paper-radio-button
paper-radio-group
paper-spinner
paper-tabs
paper-toast
paper-toggle-button
paper-toolbar
paper-tooltip
After four years of development I expected the list to be much larger.
The second surprise was how many of the elements in that list really shouldn’t
be elements at all. Some could be replaced with native elements given better
default styling, for example button for paper-button. Others could be replaced
with CSS or a non-element solution, such as iron-ajax, which shouldn’t be an
element at all and should be replaced with the fetch() function. After doing
that analysis, the number of elements that actually needed to be re-implemented
from Polymer fell to a very small number.
In the table below the ‘Native’ column is for places where we could use native
elements and just have a good default styling for them. The ‘Use Instead’
column is what we could use in place of a custom element. Here you will notice
a large number of elements that can be replaced with CSS. Finally the last
column, ‘Replacement Element’, is the name of the element we made to replace
the Polymer element:
Polymer                  | Native        | Use Instead              | Replacement Element
iron-ajax                |               | Use fetch()              |
iron-collapse            |               |                          | collapse-sk
iron-flex-layout         |               | Use CSS Flexbox/Grid     |
iron-icon                |               |                          | *-icon-sk
iron-pages               |               |                          | tabs-panel-sk
iron-resizable-behavior  |               | Use CSS Flexbox/Grid     |
iron-scroll-threshold    |               | Shouldn’t be an element  |
iron-selector            |               |                          | select-sk/multi-select-sk
paper-autocomplete       |               | No replacement yet.      |
paper-button             | button        |                          |
paper-checkbox           |               |                          | checkbox-sk
paper-dialog             |               |                          | dialog-sk
paper-dialog-scrollable  |               | Use CSS                  |
paper-drawer-panel       |               | Use CSS Flexbox/Grid     |
paper-dropdown-menu      |               |                          | nav-sk
paper-fab                | button        |                          |
paper-header-panel       |               | Use CSS Flexbox/Grid     |
paper-icon-button        | button        | button + *-icon-sk       |
paper-input              | input         |                          |
paper-item               |               |                          | nav-sk
paper-listbox            | option/select |                          |
paper-menu               |               |                          | nav-sk
paper-menu-button        |               |                          | nav-sk
paper-radio-button       |               |                          | radio-sk
paper-radio-group        |               |                          | **
paper-spinner            |               |                          | spinner-sk
paper-tabs               |               |                          | tabs-sk
paper-toast              |               |                          | toast-sk
paper-toggle-button      |               |                          | checkbox-sk
paper-toolbar            |               | Use CSS Flexbox/Grid     |
paper-tooltip            |               | Use title attribute      |
** - For radio-sk elements just set a common name like you would for a
native radio button.
That set of minimal custom elements has now been launched as
elements-sk.
Now that we have our base list of elements let’s think about the rest of the
tools and techniques we are going to need.
To get a better feel for this let’s start by looking at what a web framework
“normally” provides. The “normally” is in quotes because not all frameworks
provide all of these features, but most frameworks provide a majority of them:
Framework
Model
Tooling and structure
Elements
Templating
State Management
All good things, but why do they have to be bundled together like a TV dinner?
Let’s break each of those aspects of a framework out into their own standalone
thing and then we can pick and choose from the various implementations when we
start developing an application. This style of development we call “a la
carte” web development.
Instead of picking a monolithic solution like a web framework, you just pick
the pieces you need. Below I outline specific criteria that need to be met for
some components to participate in “a la carte” web development.
A la carte
“A la carte” web development does away with the framework, and says just
use the browser for the model; for the rest of the pieces you pick and choose
the ones that work for you. In a la carte development each bullet point is a
separate piece of software:
Tooling and structure: Defines a directory structure for how a project is put together and provides tooling such as JS transpiling, CSS prefixing, etc. for projects that conform to that directory structure. Expects ES modules with the extension that webpack, rollup, and similar tools presume, i.e. allow importing other types of files (see webpack loaders).

Elements: A library of v1 custom elements in ES6 modules. Note that these elements must be provided in ES6 modules with the extension that webpack, rollup, and similar tools presume, i.e. allow importing other types of files (see webpack loaders). The elements should also be “neat”, i.e. just HTML, CSS, and JS.

Templating: Any templating library you like, as long as it works with v1 custom elements.

State Management: Any state management library you like, if you even need one.
The assumptions needed for all of this to work together are fairly minimal:

1. ES6 modules and the extension that webpack, rollup, and similar tools presume, i.e. allow importing other types of files (see webpack loaders).

2. The base elements are “Neat”, i.e. they are JS, CSS, and HTML only. No additional libraries are used, such as a templating library. Note that sets of ‘neat’ elements also conform to #1, i.e. they are provided as webpack/rollup compatible ES6 modules.
Such code will natively run in browsers that support custom elements v1. To
get it to run in a wider range of browsers you will need to add polyfills and,
depending on the target browser version, compile the JS back to an older
version of ES, and run a prefixer on the CSS. The wider the target set of
browsers and the older the versions you are targeting, the more processing you
will need to do, but the original code doesn’t need to change, and all those
extra processing steps are only incurred by projects that need it.
Concrete
So now that we have our development system we’ve started to publish some of those pieces.
We published pulito, a stake in the ground for what a “tooling and
structure” component looks like. You will note that it isn’t very complex, nothing more than an opinionated
webpack config file. Similarly we published our set of “neat” custom elements, elements-sk.
We have used Redux in an experimental app that never shipped and haven’t needed
any state management libraries in the other applications we’ve ported over, so
our ‘state management’ library is still an open question.
Example
What is it like to use this stack? Let’s start from an empty directory
and start building a web app:
$ npm init
$ npm add pulito
We are starting from scratch so use the project skeleton that pulito provides:
$ unzip node_modules/pulito/skeleton.zip
$ npm
We can now run the dev server and see our running skeleton application:
$ make serve
Now let’s add in elements-sk and add a set of tabs to the UI.
$ npm add elements-sk
Now add imports to pages/index.js to bring in the elements we need:
<body>
  <tabs-sk>
    <button class=selected>Some Tab</button>
    <button>Another Tab</button>
  </tabs-sk>
  <tabs-panel-sk>
    <div>
      <p>This is Some Tab contents.</p>
    </div>
    <div>
      This is the contents for Another Tab.
    </div>
  </tabs-panel-sk>
  <example-element active></example-element>
</body>
Now restart the dev server and see the updated page:
$ make serve
Why is this better?
Web frameworks usually make all these choices for you; you don’t
get to choose, even if you don’t need the functionality. For example, state
management might not be needed, so why are you ‘paying’ for it? Here ‘paying’
means learning about that aspect of the web framework, and possibly even
having to serve the code that implements state management even if you never
use it. With “a la carte” development you only include what you use.
An extra benefit comes when it is time to upgrade. How much time
have you lost with massive upgrades from v1 to v2 of a web framework?
With ‘a la carte’ development the upgrades don’t have to be monolithic:
if you’ve chosen a templating library and want to upgrade to
the next version, you only need to update your templates, not
touch every aspect of your application.
Finally, ‘a la carte’ web development provides no “model” but the browser. Of
all the things that frameworks provide, “model” is the most problematic.
Instead of just using the browser as it is, many frameworks have their own
model of the browser, how DOM works, how events work, etc. I have gone into
depth on the issues
previously, but
they can be summarized as lost effort (learning something that doesn’t
translate) and a barrier to reuse. What should replace it? Just use the
browser: it already has a model for how to combine elements
together, and now that custom
elements v1 gives you the ability to create your own elements, you have all
you need.
One of the most important aspects of ‘a la carte’ web development is that it
decouples all the components, allowing them to evolve and adapt to user needs
on a much faster cycle than the normal web framework release cycle allows.
Just because we’ve published pulito
and elements-sk doesn’t mean we
believe they are the best solutions. I’d love to have a slew of options to
choose from for tooling, base element sets, templating, and state management.
I’d like to see Rollup-based tools that take the place of
pulito, and a whole swarm of “neat”
custom elements sets with varying levels of customizability and breadth.
What we’ve learned
We continue to learn as we build larger applications.
lit-html is very fast and all the applications we’ve ported over have been
smaller and faster after the port. It is rather pleasant to call the
render() function and know that the element has been rendered, without
getting tripped up by async rendering. We haven’t found the need for async
rendering either, but that’s not surprising. Let’s think about cases where
async rendering would make a big difference, i.e. where batching up renders
and doing them asynchronously would yield a big performance win. This
would have to be an element with a large number of properties, where each change
of a property would change the DOM expressed and thus would require a large
number of calls to render(). But in all the development we’ve done that
situation has never arisen; elements always have a small number of attributes
and properties. If an element takes in a large amount of data to display,
that’s usually done by passing in a small number of complex objects as
properties on the element, and that results in a small number of renders.
We haven’t found the need for Shadow DOM. In fact, I’ve come to think of the
Light DOM children of elements as part of their public API that goes along
with the attributes, properties, and events that make up the ‘normal’
programming surface of an element.
We’ve also learned that there’s a difference between creating base elements
and higher level elements as you build up your application. You are not
creating bullet-proof re-usable elements at every step of development; the
same level of detail and re-usability aren’t needed as you move up the stack.
If an element looks like it could be re-used across applications then we may
tighten up the surface of the element and add more options to cover more use
cases, but that’s done on an as-needed basis, not for every element. Just
because you are using the web component APIs to build an application doesn’t
mean that every element you build needs to be as general purpose and bullet
proof as low level elements. You can use HTML Templates without using any
other web component technology. Same for template literals, and for each of
the separate technologies that make up the web components group of APIs.
I enjoyed working through the book Creating GUI Applications with wxPython by Michael Driscoll, learning various techniques for programming GUI applications in Python using wxPython.
This book is not intended to be a beginners' tutorial. The first chapter is titled "An Intro to wxPython," but it's very basic. I think anyone with a few simple wxPython apps under their belt would have no trouble with this book, but as a complete beginner to wxPython, I struggled a bit. Again, the book is not intended for complete beginners, so that's my fault.
Of the book's 14 chapters, 12 are dedicated to example applications, one per chapter. So these are not toy applications — some of them are small, but all are complete and useful as-is, and all of the code is provided. But the code isn't just dumped for you to try to figure out — it's presented in small sections, in a logical order, with an explanation of each part.
The first application is an image viewer that opens a dialog to let you pick an image file, then displays it. It's a good choice for a first example. The functionality is useful but not at all complicated, so you can focus on the boilerplate common to wxPython applications and how to put together a few widgets into a working application.
From there, the applications gradually get more involved, including a calculator, a database editor, a tarball creator, a tool to search for NASA images, and even an XML editor.
Some of the chapters introduce useful third-party wxPython add-ons, like ObjectListView which is much better than the built-in ListView.
The final chapter is about distributing your application using PyInstaller. Including this was a good decision. As a Python developer I'm happy to pipx install an application, but if you're building applications with wxPython, your target users are quite likely not experienced Python developers, and a simple way to distribute and install your application is important if you want it to be used.
If you're going to build applications with wxPython, I recommend taking a look at this book and if possible, working through the examples. I'm sure you'll learn a lot. There are links to purchase digital or paper copies at the author's blog.
Disclosure: The author, Michael Driscoll, provided a digital copy of his book for review. However, the author was not involved in the writing of this review and all opinions are my own.
This is part three of a three-part series. This is a comprehensive guide to a basic development workflow. Using a simple, but non-trivial web application, we learn how to write tests, fix bugs, and add features using pytest and git, via feature branches. Along the way we'll touch on application design and discuss best practices.
In this installment, we will:
Simulate collaborative work by two developers.
Use the workflow we learned in part 2 to add a new feature, and fix a new bug.
This is part two of a three-part series. This is a comprehensive guide to a basic development workflow. Using a simple, but non-trivial web application, we learn how to write tests, fix bugs, and add features using pytest and git, via feature branches. Along the way we'll touch on application design and discuss best practices.
In this installment, we will:
Identify and fix a bug on a branch.
Build a new feature, also on a branch.
Use git rebase to keep our change history tidy.
Use tagging to mark new versions of our application.
For many years, we've been running an ELK (Elasticsearch, Logstash, Kibana) stack for centralized logging. We have a specific project that requires on-premise infrastructure, so sending logs off-site to a hosted solution was not an option. Over time, however, the maintenance requirements of this self-maintained ELK stack were staggering. Filebeat, for example, filled up all the disks on all the servers in a matter of hours, not once, but twice (and for different reasons) when it could not reach its Logstash/Elasticsearch endpoint. Metricbeat suffered from a similar issue: It used far too much disk space relative to the value provided in its Elasticsearch indices. And while provisioning a self-hosted ELK stack has gotten easier over the years, it's still a lengthy process, which requires extra care anytime an upgrade is needed. Are these problems solvable? Yes. But for our needs, a simpler solution was needed.
Enter rsyslog. rsyslog has been around since 2004. It's an alternative to syslog and syslog-ng. It's fast. And relative to an ELK stack, its RAM and CPU requirements are negligible.
This idea started as a proof-of-concept, and quickly turned into a production-ready centralized logging service. Our goals are as follows:
Set up a single VM to serve as a centralized log aggregator. We want the simplest possible solution, so we're going to combine all logs for each environment into a single log file, relying on the source IP address, hostname, log facility, and tag in each log line to differentiate where logs are coming from. Then, we can use tail, grep, and other command-line tools to watch or search those files, like we might have through the Kibana web interface previously.
On every other server in our cluster, we'll also use rsyslog to read and forward logs from the log files created by our application. In other words, we want an rsyslog configuration to mimic how Filebeat worked for us previously (or how the AWS CloudWatch Logs agent works, if you're using AWS).
Disclaimer: Throughout this post, we'll show you how to install and configure rsyslog manually, but you'll probably want to automate that with your configuration management tool of choice (Ansible, Salt, Chef, Puppet, etc.).
As of 2019, rsyslog is the default logger on current Debian and Ubuntu releases, but rsyslog-relp is not installed by default. We've included both for clarity.
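On a Debian or Ubuntu aggregator host, that amounts to installing the same packages we install on the forwarding servers below:

sudo apt install rsyslog rsyslog-relp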
Now, we need to create a minimal rsyslog configuration to receive logs and write them to one or more files. Let's create a file at /etc/rsyslog.d/00-log-aggregator.conf, with the following content:
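The exact file isn't reproduced here, but a minimal sketch of what such a config could look like follows (the ruleset name and output path are assumptions; the port matches the one the forwarders below send to):

# Listen for logs forwarded over RELP and write them all to one file
module(load="imrelp")

ruleset(name="receive_production") {
    action(type="omfile" file="/data/logs/production.log")
}

input(type="imrelp" port="12514" ruleset="receive_production")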
If needed, we can listen on one or more additional ports, and write those logs to a different file by appending new ruleset and input settings in our config file:
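For instance, a staging environment could get its own port and file (the port number here is an assumption):

ruleset(name="receive_staging") {
    action(type="omfile" file="/data/logs/staging.log")
}

input(type="imrelp" port="12515" ruleset="receive_staging")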
You'll probably want to rotate these logs from time to time as well. You can do that with a simple logrotate config. Create a new file /etc/logrotate.d/rsyslog_aggregator with the following content:
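The original config isn't shown here, but a reasonable sketch, assuming the logs live under /data/logs and you want daily rotation with 30 days of retention, looks like this:

/data/logs/*.log {
    daily
    rotate 30
    missingok
    notifempty
    compress
    delaycompress
    postrotate
        # Tell rsyslog to reopen its output files after rotation
        systemctl kill -s HUP rsyslog.service
    endscript
}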
To customize this configuration further, look at the logrotate man page (or type man logrotate on your UNIX-like operating system of choice).
Sending Logs to Our Central Server
We can also use rsyslog to send logs to our central server, with the help of the imfile module. First, we'll need the same packages installed on the server:
sudo apt install rsyslog rsyslog-relp
Create a file /etc/rsyslog.d/90-log-forwarder.conf with the following content:
# Poll each file every 2 seconds
module(load="imfile" PollingInterval="2")
# Create a ruleset to send logs to the right port for our environment
module(load="omrelp")
ruleset(name="send_to_remote") {
action(type="omrelp" target="syslog" port="12514") # production
}
# Send all files on this server to the same remote, tagged appropriately
input(
type="imfile"
File="/home/myapp/logs/myapp_django.log"
Tag="myapp_django:"
Facility="local7"
Ruleset="send_to_remote"
)
input(
type="imfile"
File="/home/myapp/logs/myapp_celery.log"
Tag="myapp_celery:"
Facility="local7"
Ruleset="send_to_remote"
)
Again, I listed a few example log files and tags here, but you may wish to create this file with a configuration management tool that allows you to templatize it (and create each input() in a Jinja2 {% for %} loop, for example).
Be sure to restart rsyslog (i.e., sudo service rsyslog restart) any time you change this configuration file, and inspect /var/log/syslog carefully for any errors reading and/or sending your log files.
Watching & Searching Logs
Since we've given up our fancy Kibana web interface, we need to search logs through the command line now. Thankfully, that's fairly easy with the help of tail, grep, and zgrep.
To watch logs come through as they happen, just type:
tail -f /data/logs/staging.log
You can also pipe that into grep, to narrow down the logs you're watching to a specific host or tag, for example:
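For instance, to follow only the lines tagged myapp_django (the example tag from the forwarder config above):

tail -f /data/logs/staging.log | grep myapp_django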
Of course, you could search all logs from all time with the same method, but that might take a while:
zgrep myapp_django /data/logs/staging.log.*.gz
Conclusion
There are a myriad of ways to configure rsyslog (and centralized logging generally), often with little documentation about how best to do so. Hopefully this helps you consolidate logs with minimal resource overhead. Feel free to comment below with feedback, questions, or the results of your tests with this method.
This is part one of a three-part series. This is a comprehensive guide to a basic development workflow. Using a simple, but non-trivial web application, we learn how to write tests, fix bugs, and add features using pytest and git, via feature branches. Along the way we'll touch on application design and discuss best practices.
In this installment, we will:
Talk a bit about the design of the example application.
Ensure we are set up for development.
Exercise the basics of pytest and git by writing some tests, adding a fixture, and committing our changes.
Finally, the admin interface to Stream is a
PWA that supports
the Web Share Target API,
which means I can trivially share content to Stream using the native Android
Share intent.
The code for Stream is on GitHub
and I’ve endeavored to make it customizable via the config.json file, but no
guarantees since I just got it all working today.
PyCon 2019 attracted 3,393 attendees, including a group of six Cakti. When we weren’t networking with attendees at our booth, we attended some fascinating presentations. Below are some of our favorites. You can watch these talks and more on the PyCon 2019 YouTube channel.
I attended four tutorials and this was the last one on Thursday. Each tutorial is hands-on and lasts about three hours. By far, this was my favorite, primarily due to the exercise-based format. Kevin Markham was well organized and a great teacher, most likely because he runs Data School, which I discovered is in Asheville! The tutorial centered around analyzing the TED Talks dataset from Kaggle. Kevin live-coded each lesson, demonstrating best practices for slicing and analyzing the dataset with Pandas. Then he turned it over to us, providing several possible exercises with increasing levels of difficulty, which utilized the tools he just taught. I found the in-person exercises valuable as they made us practice the techniques right then, and therefore, learn through experience. I overbooked myself with four tutorials and didn’t have enough time to practice what I learned in all of them, so I felt like I got the most out of Kevin’s format, and I look forward to future tutorials with him. The full video of the tutorial is below, or you can view a condensed version.
This was the first talk I attended. I was excited to learn about using maps in Jupyter Notebooks, as I hadn’t had the chance to do so yet. Christy Heaton’s talk was very accessible, with an easy-to-follow hypothetical question: In what cities will we be able to see upcoming solar eclipses? After starting with a brief intro on eclipses, spatial data, and coordinate systems, she walked through a Jupyter Notebook, demonstrating the ease of mapping data in a notebook with geopandas, eventually piecing together cities, eclipse paths, and years to show which cities would be best for viewing the 2024 eclipse over the U.S. I learned a lot from this talk, especially how easy it is to use matplotlib to visualize DataFrames with geometries. It’s great to see how notebooks can be used to easily explore spatial data with Pandas and Geopandas.
Sarah Withee spoke from the heart on this topic. She described the OpenAPS (open artificial pancreas system) which was created from an open source project that also involved hardware. It combines glucose monitors and insulin pumps to automatically manage insulin levels for those with type 1 diabetes. It was a project that the medical device companies didn’t want to take on because they didn’t think it’d be profitable. So Sarah and others took the matter into their own hands. From the talk, I learned a lot about how the pumps work and interact with each other and I saw how life-changing it is. Sarah is an adopter of the device, and she spoke about how her blood sugar levels went from being all over the place to being much more stable. The project really speaks to the power of people and open source. Now, the medical device companies are finally trying to incorporate it into their devices.
Shannon Zhu from Instagram provided an informative overview of how Instagram has used type hints with the Pyre library to improve the security of millions of lines of Python. Type hints are a new area for me. This talk convinced me that starting to use them would be beneficial — not for improving performance, because they aren’t (yet?) used for that, but for the sake of validating the software’s interface security, particularly in terms of ensuring that “taints” from user input are “cleaned” before they are used.
Rachael Tatman from Kaggle gave an excellent overview of the different techniques of data science and encouraged people not to use deep learning until they know they need it. Machine learning has been and still is all the rage, and there were a lot of data scientists at PyCon. But I was glad to see some of the more traditional statistical methods being promoted alongside the newer ones.
Async was half the buzz at PyCon this year (the other half was machine learning), but a lot of places aren’t using asyncio very much yet. That’s one of the reasons I really liked this talk — not only did Neil Chazin, from Agari, give an accessible intro/overview to async programming, but he helped answer one of the key questions, which is “How do I write tests for it?” Watch his talk for the answer:
Remember, you can watch more presentations from PyCon 2019 on YouTube. Comment below on your favorites!
Pictured: I traveled to Cleveland, OH, for PyCon 2019, where I got this shot of the city skyline.
This year I attended PyCon for the first time. It’s rather amazing that I haven’t been before, since I’ve been using Python professionally for over 15 years. PyCon 2019 was held in Cleveland from May 1–9. There was so much to take in, and there are so many good things to say about it. It was a fantastic experience. But rather than provide a “mission report: 2019” a la Winter Soldier, I thought I’d do something more useful — write a guide to PyCon from a newbie perspective. Here are six lessons I learned from my first PyCon.
1. Register Early
PyCon regularly sells out, and the organizers have stated no desire to increase the size of the show (about 3,400 participants). We’re not yet to WWDC levels (Apple’s developer conference that sells out moments after registration opens), but it’s a good idea to register for PyCon as early as possible.
I learned this the hard way. By the time I was ready to register in mid-April, the conference was already sold out. Fortunately, Caktus Group, as a sponsor, had an extra conference pass available, so I was able to go after all. Note to self: Next year, register in February.
2. Open a Space
One of the most attractive aspects of PyCon is the open spaces. It’s very simple: There are several rooms available throughout the conference. The organizers put out boards with a schedule for each full conference day, and some cards and pens. Anyone (anyone!) can write a session topic on a card and post it on the board for a particular time and room. Then if anyone else is interested in your topic, they show up.
This simple setup is really inspiring. I’ve been to a lot of conferences over the years, and I’m always impressed that thousands of people will travel hundreds or even thousands of miles (and, at PyCon this year, from 59 countries) to sit in conference rooms and … passively listen to other people talk. I’ve often thought at such events, I could do this at home. Why don’t we have more open conversations? At PyCon, we can! This was one of my favorite aspects of the conference.
I attended several interesting open spaces, and there were several others that I wish I could have attended. Of note are:
Caktus’ own Scott Morningstar hosted a game called WINTERHORN. This live action game is a simulation in which you try to use a variety of bad techniques to disrupt people’s lives. If you win: Congratulations, you’ve done evil in the world! Then you go away and start seeing these techniques in use, and you’re wiser to the ways of the world. (You can read more about Scott and his passion for live action role play games.)
Vince Salvino from CodeRed hosted an open space on Wagtail CMS. It was a great opportunity to talk about how different people are using the CMS. There were also several people present who were new to or just exploring Wagtail, so we were able to answer their questions and help them get enthused about this excellent CMS framework.
I went to the open space on Falcon Framework, a very fast lightweight API framework. Here, I was the new person exploring something for the first time. The team spent much of the session talking about the framework in general terms for newbies like me, as well as plans for the upcoming releases (async was a big topic). I was so impressed with both the team and the framework that I decided to join them during the sprints (more on that below).
I also dove right in and hosted an open space myself — “Making Wagtail Better for Developers.” At Caktus we use Wagtail to provide our customers with a great content editing experience. We have another post on why Caktus loves Wagtail. It really is an excellent CMS, but there are a few aspects that could be made better, especially when it comes to data migrations, importing and exporting content, and editing content with nested StreamFields. I wanted to talk about a few of those issues with other developers and compare notes. Apparently, others wanted to talk about these things, too, because they showed up!
We made Wagtail better. (photo by Colin Copeland)
We had a good conversation about Wagtail and its development, and I was inspired to help make Wagtail better for developers by contributing to the project (I plan to talk about my work on data migrations in an upcoming blog post).
That is one of the best outcomes from a conference like this: You come away inspired to do better work.
3. Try Some Talks
When most people go to conferences, they focus on the talks, and while talks and keynotes are a great source of inspiration and learning, you can wear yourself out trying to go to them all. PyCon had a packed schedule of talks and keynotes (upcoming blog posts will go over some of them). Instead of maxing yourself out on all the talks, my recommendation is to pick only two or three talks to attend each day. Pick ones that will expand your knowledge and help you to think about how to grow in the coming year. Many of the scheduled speakers make their slides and recordings freely available after the conference, so if you can’t get to a talk during PyCon, you can continue the conference later in the comfort of your own home.
4. Nurture Your Network
It’s all about the community, they say, and seasoned conference goers know that conferences are one of the best opportunities to “make new friends, and keep the old.” Frankly, I put a higher priority on relationship-building activities than going to talks, and PyCon has a plethora of these activities to help you nurture your network:
Open Spaces: Good ones are engaging and relational. They aren’t just lectures.
Breakfast and Lunch: Every table is large, and you can join any conversation. Don’t miss the opportunity to talk with someone new. Like I did ... sleeping late and missing breakfast. At least I always got to lunch and met new people there.
Dinner: If you meet a new friend or an old one, you can have dinner together offsite. This is a greater time commitment than breakfast or lunch, and can lead to a much deeper relationship. This is especially important for those who work remotely, like me: Having a meal or drink together is just about the only thing we can’t do on Slack. Over a meal, you can get to know people so much better, and the good vibes carry over into a better working relationship. Pictured: A group of us from Caktus had dinner one night with a couple of folks from the Truth Initiative, with whom we’ve recently been working.
Vendors / Sponsors / Exhibitors: Many of the exhibitors are developers, and all of them have periods of boredom during three days of exhibiting. If you see someone at a booth who is not talking to anyone, they are probably bored out of their wits and would welcome an opportunity to talk to you. Just walk up and say, “Hi! Can you tell me about [whatever it is that they are exhibiting]?” It can lead to some of the best conversations you have. Also, free t-shirts. However, the Caktus sales team was not bored during the conference — whenever I went by, Tim Scales and Ian Huckabee (pictured) were always busy talking with someone.
Job Fair: The job fair, which is open to the public, is another great opportunity to talk with company representatives. You’re less likely to find someone bored, but you’ll be able to pack a lot of conversations into a couple of hours. Even if you’re not looking for a new job, it’s interesting to see what opportunities are available, and see if there is a new skill set you want to work on. You might even find the perfect opportunity. Last year, I met Caktus at the PyCon job fair (I went to the fair but not the conference itself). This year, I talked with a lot of other companies and learned more about the lay of the land. My take: There’s good work available for Python engineers who understand DevOps, can build APIs, can support data scientists, and work well in a dynamic, collaborative environment. In fact, that sounds to me a lot like working with Caktus Group. They’re hiring, by the way.
Sprints: Not only are you contributing to a project, but you are working alongside some really smart people whom you might have just met. (More on the sprints below.)
5. Eat Early
Speaking of meals, it’s imperative to arrive early to the meal line, because developers are a hungry lot, and there isn’t really any extra food. I learned this the hard way. One day, I was so busy sprinting and trying to get my tests working that I didn’t go to lunch until 45 minutes after the opening bell. At that point, all that was left was a large plate of potatoes. At least they were really good potatoes. Pictured: I was on time that day.
6. Sprint!
After the conference itself there are four days of sprints. These are like open spaces, but for coding. Anyone can host a sprint for their open-source project. You get a table in one of the open spaces rooms, and you code together for any length of time, up to four days. The conference organizers provide the space (with power and internet) and lunch (be quick or eat potatoes), and you provide the code. It’s a pretty sweet opportunity to get your feet wet with some new projects, or to work with your project team in person (a great opportunity when you all live in different places). It’s a great time to be with others, and despite the fact that a lot of joking and talking takes place around the table, a lot of serious coding also happens.
For my part, even though my work and family schedule didn’t allow me to stay for the entire sprint, I decided to stay through Monday (the first sprint day) and check out a couple of projects. Based on my interactions during the conference, I chose to sprint with Falcon Framework (top photo at right) for a few hours, and then switched to CodeRed CMS (bottom photo at right) for a couple of hours. I was a complete newbie on both projects, but both teams were super helpful in getting me onboard, and I was even able to commit some code to each project (see here and here) before driving home Monday evening. Sprinting with these teams was a wonderful experience for me.
Prepare for Pittsburgh
Cleveland was pretty cool. I’m already looking forward to being in Pittsburgh for PyCon 2020. I hope to see you there. Just be sure to register on time — details about the next PyCon should be released mid-summer; you can watch @PyCon on Twitter for updates.
This colostomy takedown surgery is the second in the pair of surgeries I have
had this year. If you would like to read the story of how I came to need a
colostomy takedown please read A thing that
happened.
As opposed to the first surgery which was done as an emergency procedure, this
was a planned surgery, which made a world of difference. We were able to
research and hire a patient advocate to stay with me a few nights, apply for
short term disability before the surgery, etc.
Day of surgery
I was told to arrive at 11 AM for a 2 PM surgery, but when I got there at 11
they told me that both the surgeon and anesthetist had arrived and were ready
to start, so my prep was actually pretty quick. They brought me back without
Lynne to get me dressed and an IV started before she was allowed to come back
to see me, and by the time they let her back we only had 5 minutes together
before they wheeled me off to the operating room.
The prep for the surgery went well. I have deep and abiding issues with getting
IVs, a leftover from a previous surgery years ago when I was traumatized by
an awful IV experience. This time, instead of just sucking it up, I talked to
the anesthesiologist beforehand, they added notes to my chart about my issues
with IVs, and, on their advice, I also talked to the nurse giving me the IV.
She was very understanding and gave me a Lidocaine shot before starting the
IV. This turned out to be a much better strategy than sucking it up, and the IV
was painless and easy.
Compared to the first surgery, which was an emergency surgery, I was awake for
much more of the process leading up to the operation, even helping to move myself
onto the operating table and watching them apply straps to hold me in place. I
was also woken up in the operating room at the end of the surgery; I remember
moving off the operating table and being rolled into the PACU.
Going into the surgery the goal was to do the whole thing laparoscopically, but
if that didn’t work they would have to open me back up in the same dramatic fashion
as the first surgery. Also, if there was still too much swelling
or scarring on my large intestine they might have to give me an ileostomy
that I would then need to wait yet another three months to have taken down.
Luckily all went well, and when I woke up I was glad to hear that they were able
to do the whole thing laparoscopically, and I didn’t have an ostomy of any
kind.
When coming out of surgery you initially have no pain medication in you, so
the staff need to judge how much pain you are in and how much pain medication to give
you. This is always a point of confusion post-op for me, because I do this thing
where I shake when I’m in extreme pain. Not a little shaking, mind you; I’m
talking full-body, rack-the-hospital-bed shakes. This unfortunately leads
the nurses to believe I’m cold, not in pain, so they start wrapping me in
more and more layers of blankets. By the time they let Lynne into the PACU I
was under 10 blankets, 6 over my body and 4 around my head. She explained that
I wasn’t cold, but in pain. The nurse gave me my first dose of pain medicine,
but having just come out from under anesthesia I was a little incoherent:
Nurse: Did that help?
Me: Yes.
Lynne: You are still shaking, are you still in pain?
Me: Yes.
Lynne: So that last set of pain meds wasn't enough?
Me: What pain meds?
As you can see from this exchange, both the nurse and Lynne are saints.
After they had my pain under control they set me up with a pain pump and moved
me to my room.
The doctors and nurses told me that the more I walked the better I would heal,
and I took them very seriously. I was out of the PACU and in my room by 5 PM and took my
first walk an hour or so later, with Lynne pushing the IV pole and me using a
walker. Lynne then had to leave to take care of the kids for the night, so I
took another walk with the help of the nurse at 8 PM. That exhausted me and I
took an hour nap, but that got me behind on the pain pump and I felt it when I
woke up at 9 PM.
My pain was mostly in the area where the colostomy was closed, except for the hiccups,
which caused intense pain right below the rib cage. My doctor explained that they
were the result of the CO2 they pumped me full of for the laparoscopic surgery,
and that they should go away as my body absorbed the rest of the CO2.
I actually slept pretty well that first night, sleeping in solid blocks of 2-4
hours and then waking up to use the pain pump.
Day 2
I was serious about my walking and if I was awake and had my pain under control
I would try to get out for a walk. I was also serious about keeping hydrated,
as I had been since the first surgery, as becoming dehydrated and constipated
with a colostomy was something I dreaded. So as the day wore on I kept drinking
and walking, but after a while it felt like I couldn’t drink any more and my
stomach was getting sore. The doctor on that shift suggested I stop drinking;
my bowels probably hadn’t woken up yet and everything I was drinking was just
accumulating in my stomach. I did hold off on the liquids for the rest of the
day, and when I saw my own doctor later she agreed with his assessment. At that
point I suggested that if I wasn’t going to drink, maybe
they should increase my rate of IV fluids above the KVO (Keep Vein Open) level
so I didn’t get dehydrated. She agreed and actually changed me to IV Lactate. I
don’t think anyone else would have thought of this on their own; I point it
out because no matter how great the care, you really need to be your own advocate.
That morning they removed both the catheter and the pain pump (apparently I hadn’t
used the pain pump very much), and they moved me to oral pain meds, falling back to
IV injections for breakthrough pain, which didn’t happen very often. I was still
having hiccups, which were still painful and would be triggered by coughing,
laughing, or, most annoyingly, just saying the word “hiccups”.
I walked 6 more times, now without the walker, since without the pain pump I was
very stable on my feet. As the day went on my stomach got better, and towards
evening I started to pee a lot more, around 700ml every hour or two. The IV
fluids were only coming in at 75ml/h, so that wasn’t the source, and I hoped that
my digestive tract had started to wake up.
I had a total of four bowel movements during the day, but they were entirely
blood clots, and each one was progressively smaller than the previous one, and
since I didn’t pass any gas they didn’t “count”. What the staff were waiting
for was me to pass gas, at which point I would be allowed to transition from
“clears” to solid foods. I know they were blood clots and not “blood” because I
dragged a nurse into the bathroom each time to inspect them and confirm that it
wasn’t “blood”.
I took 3 more walks through the evening for a total of 10 walks for the whole
day.
Day 3
Early in the morning I saw the other doctor, and he said I had urinated over 3L
during the night; in addition, the nurse listened to my abdomen and said it sounded
very active, so I was hopeful that would be enough evidence that my bowels had
woken up.
The majority of the day was again more walks, keeping on top of oral pain meds,
and napping between walks. The only change was that the weather was now beautiful,
so I upgraded to walking around outside.
The hiccups were gone at this point, and the pain was now mostly at the
site of the former colostomy and only during transitions to and from walking or lying
down.
I saw my surgeon at 3 PM and she wrote me up for solid foods, and I immediately
ate one Ritz cracker from a sleeve I had squirreled away in my travel bag for
just such an occasion. Later they delivered the hospital dinner, which I nibbled
at, and then I went for another walk and back to sleeping. I woke later that
evening and had a small bowl of chicken noodle soup they prepared at the
nurses’ station, got my pain pills, and went back to sleep.
Day 4
I took my first walk at 6 AM and had a bowel movement with real stool and a
small amount of blood clot. I also had a lot of gas coming out both ends of my
digestive tract.
Later that morning I walked down to Au Bon Pain in the hospital lobby to get a
croissant for breakfast, and I also went back down there for lunch. Yes, the
hospital food was that bad. The folks working at the Au Bon Pain seemed
oblivious to me wearing a hospital gown and pushing my IV pole as I ordered my
lunch, but some of the other patrons gave me wary looks.
I was released later that day.
Day 5
I didn’t realize that when they released me I was still on 10mg of oxy every 4
hours, but they wrote a prescription for 5mg of oxy every 4 hours. It took me
some time to coordinate my pills and get onto an overlapping Motrin/Tylenol schedule,
with just oxy for the breakthrough pain.
At this point I am home and walking 2-3 times a day, where each walk is about a
mile long. A new symptom appears at this point: occasionally I will get a
muscle cramp in my abdomen around the old ostomy site, and it will slowly
spread across my entire upper abdomen. It usually only lasts a minute or two,
but it is fairly painful when it happens.
Day 7
I am no longer using the oxy for breakthrough pain.
I did have a bit of a panic this day. I had been regular, and my stools were
beginning to become more formed, but then I “missed”, or really just went a few
hours past when I was due for a BM, so I tried lots of things: taking a stool
softener, drinking apple juice, etc. A few hours later I had a normal BM, but
unfortunately all the things I had tried to loosen up my stools were still in
my system working away, and I ended up giving myself diarrhea and re-irritating
my bowels. Fortunately that died down over the next 24 hours.
Day 11
No longer taking Motrin or Tylenol on a regular basis. At this point I am fine
if I am standing, walking, or lying down, but sitting gets uncomfortable, and
if I sit too long then when I stand up the area just under my ribs feels
uncomfortable. It’s hard to describe, but it almost feels like my intestines
stiffen into one position when I sit, and then when I stand up they resist, in a
painful way, moving back into the standing position.
Day 12
Stopped wet packing the former colostomy site as it was almost completely
closed.
Day 14
Actually had enough brain power to do some real programming, but that only
lasted for about 30 minutes; it is shocking how much pain and healing will take
out of you and turn your brain to mush.
Day 30
First day back to work. I am still taking Motrin occasionally, usually if I
just try to do too much. I can now sit for much longer periods of time, and I
don’t get that stiffness in my abdomen when I stand, but I will still
occasionally get a muscle spasm around the old ostomy site. All my wounds are
closed at this point, but my abdomen is covered with scars, swollen in some
areas from the last surgery and distended where the colostomy was; in
summary, I look like my stomach was run over by farm equipment.
At this point I am 20 lbs lighter than when I went into the emergency room for
that first surgery. I lost 15 lbs from the first surgery and another 5 lbs from
this latest surgery. I’m fine with the weight loss, just not what I had to go
through to get here.
I just published
webmention-run, a Google
Cloud Run application written in
Go that implements
Webmention. I’m now using this to handle
webmentions on bitworking.org. Given the generous free
quota for Google Cloud Run I don’t expect this to cost me anything. This is on top of
using Firebase Hosting to host the
static (Jekyll) parts of my blog, which is also
effectively free.
Another awesome feature is that both services will provide SSL certificates; in
my case Firebase Hosting provides the cert for https://bitworking.org, and
Google Cloud Run provides the SSL cert for the subdomain where my instance of
webmention-run is running, which is the subdomain
https://webmention.bitworking.org.
The Great Famine of
1315-1317 only
lasted two years, was nowhere close to the change in climate that we are
looking in the face right now, and it wiped out 10-25% of the population.
To provide some measure of relief, the future was mortgaged by slaughtering
the draft animals, eating the seed grain, abandoning children to fend for
themselves (see “Hansel and Gretel”) and, among old people, voluntarily
refusing food for the younger generation to survive.
One of the things that frightens me most about climate change is that small
changes can have drastic effects, and institutions can unravel much more
quickly than anyone imagines. My fear is that by the time things get bad
enough that we need to try things like geoengineering, our institutions will
have fallen apart and we’ll be incapable of launching such efforts.
In terms of revenue collection, you wouldn’t want to just focus on the
ordinary income rate, because people who are wealthy have a rounding error
of ordinary income.
I would love to see the U.S. do away with categories of income (income, earned
interest, capital gains, etc) and make it all just one bucket and tax that at
a progressive rate.
Look at poor Ken, so freaked out that everyone is talking about taxes.
Obviously a firm believer in trickle-down economics, his entire screed boils down to:
To get the full context it’s useful to watch the video from the beginning,
where historian Rutger Bregman schools Michael Dell on his ignorant comment
about a top marginal tax rate of 70%:
In this essay, I will argue that the interaction of concentrated corporate
power and politics is a threat to the functioning of the free market economy
and to the economic prosperity it can generate, and a threat to democracy as
well.
We found no statistically significant alphas — despite testing every
possible school with a reasonable sample size. MBA programs simply do not
produce CEOs who are better at running companies, if performance is measured
by stock price return.
But if there is no evidence that stock returns are attributable to CEOs,
then what justification is there for their stratospheric pay? How much
longer will investors and boards be fooled by randomness and hollow
credentialism?
The Django documentation recommends always starting your project with a custom user model
(even if it's identical to Django's to begin with), to make it easier to customize later if you need to. But what are you
supposed to do if you didn't see this when starting a project, or if you inherited a project without a custom user
model and you need to add one?
At Caktus, when Django first added support for a custom user model, we were still using South for migrations. Hard to
believe! Nearly six years ago, I wrote a post about migrating to a custom user model that is, of course, largely obsolete
now that Django has built-in support for database migrations. As such, I thought it would be helpful to put together a
new post for anyone who needs to add a custom user model to their existing project on Django 2.0+.
Background
As of the time of this post, ticket #25313 is open in the Django ticket tracker for adding further documentation
about this issue. This ticket includes some high-level steps to follow when moving to a custom user model, and
I recommend familiarizing yourself with this first. As noted in the documentation under Changing to a custom user model mid-project,
"Changing AUTH_USER_MODEL after you’ve created database tables is significantly more difficult since it affects foreign
keys and many-to-many relationships, for example."
The instructions I put together below vary somewhat from the high-level instructions in ticket #25313, I think (hope)
in positive and less destructive ways. That said, there's a reason this ticket has been open for more than four years —
it’s hard. So, as mentioned in the ticket:
Proceed with caution, and make sure you have a database backup (and a
working process for restoring it) before changing your production database.
Overview
Steps 1 and 2 below are the same as they were in 2013 (circa Django 1.5), and everything after that differs since we're
now using Django's built-in migrations (instead of South). At a high level, our strategy is to create a model in one of
our own apps that has all the same fields as auth.User and uses the same underlying database table. Then, we fake
the initial migration for our custom user model, test the changes thoroughly, and deploy everything up until this point
to production. Once complete, you'll have a custom user model in your project, as recommended in the Django
documentation, which you can continue to tweak to your liking.
Contrary to some other methods (including my 2013 post), I chose this time to update the existing auth_user table
to help ensure existing foreign key references stay intact. The downside is that it currently requires a little manual
fiddling in the database. Still, if you're using a database with referential integrity checking (which you should be),
you'll sleep easier at night knowing you didn't mess up a data migration affecting all the users in your database.
If you (and a few others) can confirm that something like the below works for you, then perhaps some iteration of this
process may make it into the Django documentation at some point.
Migration Process
Here's my approach for switching to a custom user model mid-project:
Assumptions:
You have an existing project without a custom user model.
You're using Django's migrations, and all migrations are up-to-date (and have been applied to the production
database).
You have an existing set of users that you need to keep, and any number of models that point to Django's
built-in User model.
First, assess any third party apps to make sure they either don't have any references to Django's
User model, or if they do, that they use Django's generic methods for referencing the user model.
Next, do the same thing for your own project. Go through the code looking for any references you might have to the
User model, and replace them with the same generic references. In short, you can use
the get_user_model() method to get the model directly, or if you need to create a ForeignKey or other database
relationship to the user model, use settings.AUTH_USER_MODEL (which is simply a string corresponding to the
appname.ModelName path to the user model).
Note that get_user_model() cannot be called at the module level in any models.py file (and by extension any
file that a models.py imports), since you'll end up with a circular import. Generally, it's easier to
keep calls to get_user_model() inside a method whenever possible (so it's called at run time rather than load
time), and use settings.AUTH_USER_MODEL in all other cases. This isn't always possible (e.g., when creating a
ModelForm), but the less you use it at the module level, the fewer circular imports you'll have to stumble your
way through.
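For illustration, here is a minimal sketch of both approaches (the Order model and the active_user_count() helper are made-up examples, not from any real project):

from django.conf import settings
from django.contrib.auth import get_user_model
from django.db import models

class Order(models.Model):
    # Use the settings string for database relationships; it is resolved lazily.
    customer = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)

def active_user_count():
    # Call get_user_model() at run time, inside a function, to avoid circular imports.
    return get_user_model().objects.filter(is_active=True).count()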
Start a new users app (or give it another name of your choice, such as accounts). If preferred, you can use
an existing app, but it must be an app without any pre-existing migration history because as noted in the Django
documentation, "due
to limitations of Django’s dynamic dependency feature for swappable models, the model referenced by
AUTH_USER_MODEL must be created in the first migration of its app (usually called 0001_initial); otherwise,
you'll have dependency issues."
python manage.py startapp users
Add a new User model to users/models.py, with a db_table that will make it use the same database table as
the existing auth.User model. For simplicity when updating content types later (and if you'd like your
many-to-many table naming in the underlying database schema to match the name of your user model), you should call it
User as I've done here. You can rename it later if you like.
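If the model starts out identical to Django's built-in one, it might look something like this (a sketch; the db_table value is the important part):

from django.contrib.auth.models import AbstractUser

class User(AbstractUser):
    class Meta:
        # Reuse the table that Django's auth.User has been using all along.
        db_table = 'auth_user'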
Create an initial migration for your new User model:
python manage.py makemigrations
You should end up with a new migration file users/migrations/0001_initial.py.
Since the auth_user table already exists, normally in this situation we would fake this migration with the
command python manage.py migrate users --fake-initial. If you try to run that now, however, you'll get an
InconsistentMigrationHistory error, because Django performs a sanity check before faking the migration
that prevents it from being applied. In particular, it does not allow this migration to be faked because other
migrations that depend on it, i.e., any migrations that include references to settings.AUTH_USER_MODEL, have
already been run. I'm not entirely sure why Django places this restriction on faking migrations, since the whole
point is to tell it that the migration has, in fact, already been applied (if you know why, please comment below).
Instead, you can accomplish the same result by adding the initial migration for your new users app to the
migration history by hand:
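One way to do that (a sketch using dbshell; the SQL below is written for Postgres and may need adjusting for other backends) is:

echo "INSERT INTO django_migrations (app, name, applied) VALUES ('users', '0001_initial', CURRENT_TIMESTAMP);" | python manage.py dbshell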
If you're using an app name other than users, replace users in the line above with the name of the Django app
that holds your user model.
At the same time, let's update the django_content_types table with the new app_label for our user model, so
existing references to this content type will remain intact. As with the prior database change, this change must
be made before running migrate. The reason for this is that migrate will create any non-existent content
types, which will then prevent you from updating the old content type with the new app label (with a "duplicate key
value violates unique constraint" error).
echo"UPDATE django_content_type SET app_label = 'users' WHERE app_label = 'auth' and model = 'user';"| python manage.py dbshell
Again, if you called your app something other than users, be sure to update SET app_label = 'users' in the
above with your chosen app name.
Note that this SQL is for Postgres, and may vary somewhat for other database backends.
At this point, you should stop and deploy everything to a staging environment, as attempting to
run migrate before manually tweaking your migration history will fail. If your automated deployment process runs
migrate (which it likely does), you will need to update that process to run these two SQL statements before migrate (in particular because migrate will create any non-existent content types for you, thereby preventing
you from updating the existing content type in the database without further fiddling). Test this process thoroughly
(perhaps even multiple times) in a staging environment to make sure you have everything automated correctly.
After testing and fixing any errors, everything up until this point should be deployed to production (and/or any
other environments where you need to keep the existing user database), after ensuring that you have a good backup
and a process for restoring it in the event anything goes wrong.
Now, you should be able to make changes to your users.User model and run makemigrations / migrate as
needed. For example, as a first step, you may wish to rename the auth_user table to something in your users
app's namespace. You can do so by removing db_table from your User model, so it looks like this:
class User(AbstractUser):
    pass
You'll also need to create and run a new migration to make this change in the database:
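Presumably something along these lines (the exact migration name Django generates will differ):

python manage.py makemigrations users
python manage.py migrate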
That should be it. You should now be able to make other changes (and create migrations for those changes) to your custom
User model. The types of changes you can make and how to go about making those changes is outside the scope of this
post, so I recommend carefully reading through the Django documentation on substituting a custom User model. In the event you
opt to switch from AbstractUser to AbstractBaseUser, be sure to create data migrations for any of the fields
provided by AbstractUser that you want to keep before deleting those columns from your database. For more on this topic,
check out our post about DjangoCon 2017, where we link to a talk by Julia M Looney titled
"Getting the most out of Django’s User Model." My colleague Dmitriy also has a great post with some other suggestions for picking up old projects.
Once again, please test this carefully in a staging environment before attempting it on production, and make sure you
have a working database backup. Good luck, and please comment below with any success or failure stories, or ideas
on how to improve upon this process!
Pictured: Scott, Kat, and Tim take a quick break for a game of cards.
It may be no surprise that there are gamers among our Caktus crew, but you may be surprised by the type of games that Cakti play. From the ancient art of Mahjong to the modern fun of Pokemon, our team members cover it all.
Mahjong Master & Crossword Champ Tim Scales
Tim’s grandmother taught him to play Mahjong when he was about 10 years old. She learned to play the Chinese tile-based game in the ‘50s while living in Singapore, and she continued playing after she moved back to England. As a result, it became a regular family pastime. Tim enjoys the speed, complexity, and competitiveness of Mahjong. “It’s a hard game for beginners to pick up, so while I love bringing new people into the game, it’s also a pleasure to play at the breakneck speed of experienced players,” he said.
In addition to Mahjong, Tim’s grandmother also instilled in him a love for crossword puzzles. “When I went to visit her in England, we would complete the Daily Telegraph crossword every morning over breakfast. She lived to be over 100 and was sharp until the end. She swore that doing the crossword was her secret. I view it as an investment in my future mental sharpness,” said Tim, who’s now a daily New York Times crossword solver.
According to the NYT Crossword app, Tim has completed over 800 puzzles, not including the hundreds he previously did on paper! And there’s no stopping him now. He’s on a 15-month streak of solving the NYT crossword (almost) every day. Coincidentally, another Cakti, Karen Tracey, created NYT crosswords that were published from 2003 to 2010. “I went into the archives and found a puzzle that Karen created, and it was hard! It took about twice my average time to complete it,” Tim said. (Go Karen!)
Live Action Storyteller Scott Morningstar
In 1977, Scott’s uncle gave him and his brother Jason the original Dungeons & Dragons boxed set, which kickstarted a lifelong interest in tabletop and role-playing games. Jason began designing these types of games when he was in high school, and Scott tested them, providing feedback and suggestions. Several of Scott’s favorite games are ones that his brother created. In 2005, Jason founded Bully Pulpit Games.
Thanks in large part to D&D, Scott is especially into live-action role-playing games (or LARP games), which involve improvisation, storytelling, and acting. It’s the perfect fit for Scott, who says he was a “theater geek”. He prefers the genre of American Freeform LARP, which can last anywhere from 90 minutes to 4 hours. These games involve freeform story and character development. Scott especially enjoys LARPs based on historical events and true stories, or ones that involve complex scenarios like time travel. “A lot of people play a game to win, but we play for the best story,” Scott explained. “I like to tell a good story. If you watch us play, you’ll see that what we do is closer to improv theatre than gaming.”
For 20 years, Scott and his brother have been involved with a weekly gaming group, and they regularly test games that Jason creates. They’ve also tested games made by designers around the world. “My favorite part of it all is spending time with my brother. Gaming has kept us connected,” Scott said. It’s also an interest that Scott has passed down to his sons, who are also into gaming.
Scott enjoys sharing his gaming interest and bringing other players together. He and his brother founded LARP Shack, which attracts players from as far as Washington, D.C., and Philadelphia. Scott has hosted LARP Shack events at Caktus since August 2017, and the next one will be held on April 27 — details in this private Facebook group and on the Caktus event calendar. Scott also organizes Local Area Network (LAN) parties at Caktus. These parties are a blast from the past! Attendees bring their own computer, which they connect to form a LAN to play online games. It’s a bit like playing online games during the days of dial-up. Scott plans to host the next LAN party sometime this spring.
Video Game Enthusiast Kat Smith
Kat can’t recall a time when she didn’t have a gaming console (or two, or three, or more!). As a kid, she played Bubble Bobble on a classic Nintendo with her grandmother, and she often watched her brother play Final Fantasy X. Other consoles she had growing up included the Super Nintendo, Sega Genesis, Nintendo 64, and PlayStation 2 and 3. Currently, Kat is more interested in PC gaming, but she still owns a PlayStation 4 Pro, Nintendo Switch, and a Wii U. Kat’s cat (pictured) also enjoys watching the video games.
She used to play a lot of multiplayer online battle arena (MOBA) games, “but the older I get, the more I like relaxing games,” Kat said. Now she’s into world builder, simulation, and visually appealing games. She enjoys open-ended games, and a few of her favorite video games are the Witcher 3, Skyrim, Civilization V, and Banished. She’s also a regular Minecraft player, and has it for PC, phone, and console. “Every 6 months or so I play Minecraft, mostly to see what updates have been made to the game. It's a really relaxing game where you can do whatever you want. And it's very customizable!” Kat said.
Kat has also been active in the Pokemon Go community since July 2016, and she plays daily. She’s now at level 33 (out of 40)! She sometimes participates in raids with fellow Cakti Dmitriy Chukhin. The Civil Rights Mural located outside of the Caktus office is a Pokemon Gym so there's almost always an opportunity to find a Pikachu or a Magikarp on her way in and out of the office. On weekends, Kat enjoys walking around town with her fiance and their dog, checking the poke stops on their route. If Kat were a Pokemon, she said she’d be a Snorlax or a Slaking!
Pictured: The final rush is on! Staff quickly check materials for our PyCon booth.
PyCon 2019 is almost here, and we’re excited to continue to sponsor this premier Python event, which takes place in Cleveland, OH, from May 1 - 9. PyCon attracts attendees from around the world, and for the first time, the conference will include a track of talks in Spanish.
Caktus at PyCon
Connecting with the Python community is one of our favorite parts of participating in PyCon. We love to catch up with people we’ve met before and see new faces, too! We’ll be in the Exhibit Hall at booth 645 on May 2 - 4, where we’ll have swag, games, and giveaways.
Some of you may remember our Ultimate Tic Tac Toe game from previous years. Only a few committed players were able to beat the AI opponent last year. This year, any (human) champions will earn a Caktus hoodie and be entered into a drawing to win a Google AIY Vision Kit and a Google AIY Voice Kit.
Must-See Talks & Events
PyCon consistently attracts top-notch speakers who present on a variety of informative topics. Our team is especially looking forward to the following:
Check out the full schedule of talks. Some of these will likely appear in our follow-up PyCon Must-See Talks series, so if you can’t make it to the event, check back in June for our top picks.
Open Spaces: Beyond the scheduled talks, our Technology Support Specialist Scott Morningstar is looking forward to the Open Spaces sessions, which are self-organizing, meetup-like events. Scott plans to run a game of WINTERHORN during one of the open spaces times. The live-action game allows players to reflect on the government and opportunities for activism. “I’m not sure if playing WINTERHORN will make you a better developer, but it may make you a better citizen, or at least better informed about what is happening in the world,” Scott said. Read more about Scott's passion for games like WINTERHORN.
Arts Festival: This year, PyCon includes a mini arts festival called The Art of Python, which will “showcase novel art that helps us share our emotionally charged experiences of programming (particularly in Python).” With his background in STEAM education (STEM + the Arts), account executive Tim Scales is particularly excited about the arts festival, which will provide a creative complement to the technical presentations and lectures.
Job Fair Open to Public
Are you a sharp Django web developer searching for your next opportunity? Good news — we’re hiring! View the spec and apply from our Careers page. We’ll also be at table 34 during the PyCon job fair on May 5, which is open to the public, so come meet the hiring manager and learn more about what it’s like to work at Caktus.
Don’t be a Stranger!
Come see us at our booth, look for members of the Caktus team in our T-shirts during the event, or go ahead and schedule a meeting with us.
Whether you’ll be at PyCon or following along from home, we’ll tweet from @CaktusGroup. Be sure to follow us for the latest updates from the event.
Caktus Changing from Django to New COBOL-based Framework
Beginning immediately, Caktus will build new projects
using our new COBOL-based framework, ADD COBOL TO WEB.
Time-tested
We've come to realize that new-fangled languages like Python
(1989) offer nothing that more time-tested languages such as
COBOL (1959) cannot. We're tired of trying to keep up with the latest,
greatest thing (Django, 2003). Accordingly, we're going back to old,
reliable COBOL.
Flexible and Powerful
COBOL provides us with flexibility and power that Python cannot
match. A statement like "ALTER X PROCEED TO Y" can remotely modify
a GO TO statement in procedure X to target a completely different
place in the program!
What could be simpler? Statements are terminated by periods, which we've been used to since childhood. No pesky parentheses, just write
DISPLAY "Hello, world!" and the program displays Hello,
world!.
COBOL for the Web
Of course, COBOL pre-dates the web, so it doesn't come with
support for building websites. We have therefore developed
a new framework, ADD COBOL TO WEB, that links COBOL to this
new-fangled (1989) web.
We take full advantage of COBOL's built-in templating
facilities: the PICTURE clause and the Report Writer.
This is ideal for table output:
01 sales-on-day TYPE DETAIL, LINE + 1.
03 COL 3 VALUE "Sales on".
03 COL 12 PIC 99/99/9999 SOURCE sales-date.
03 COL 21 VALUE "were".
03 COL 26 PIC $$$$9.99 SOURCE sales-amount.
And the application to generating HTTP responses and
HTML content is, of course, obvious.
Open Source
We plan to share the complete source for ADD COBOL TO WEB
as soon as someone ports git to COBOL.
Customer Benefit
Our customers will appreciate our use of long-established tools
to build their websites, with future maintainability guaranteed
by the many COBOL programmers in the workforce.
We recently picked up an IBM 370 from eBay and are arranging
shipping, and additional power to our office basement. We've been
assured the 370 was state-of-the-art for COBOL in its day, and can
compile literally tens of statements per minute. Once our business takes off, we'll be able to afford a 256 KB memory expansion to speed
it up even more!
Also, keep an eye on our job postings. We will soon be looking for
experienced mainframe operators who can help us deploy our
new applications.
In this post, I review some reasons why it's really difficult to program correctly when using times, dates, time zones, and daylight saving time, and then I'll give some advice for working with them in Python and Django. Also, I'll go over why I hate daylight saving time (DST).
TIME ZONES
Let's start with some problems with time zones, because they're bad
enough even before we consider DST, but they'll help us ease into it.
Time Zones Shuffle
Time zones are a human invention, and humans tend to change their
minds, so time zones also change over time.
Many parts of the world struggle with time changes. For example, let's look at the Pacific/Apia time zone, which is the time
zone of the independent country of Samoa. Through December 29, 2011,
it was -11 hours from Coordinated Universal Time (UTC). From December 31, 2011, Pacific/Apia became
+13 hours from UTC.
What happened on December 30, 2011? Well, it never
happened in Samoa, because December 29, 23:59:59-11:00 is followed
immediately by December 31, 0:00:00+13:00.
Local date    Local time   Zone      UTC date      UTC time   Zone
2011-12-29    23:59:59     UTC-11    2011-12-30    10:59:59   UTC
2011-12-31    00:00:00     UTC+13    2011-12-30    11:00:00   UTC
That's an extreme example, but time zones change more often than
you might think, often due to changes in government or country boundaries.
The bottom line here is that even knowing the time and time zone, it's
meaningless unless you also know the date.
Always Convert to UTC?
As programmers, we're encouraged to avoid issues with time zones by
"converting" times to UTC (Coordinated
Universal Time) as early as possible, and convert to the local time
zone only when necessary to display times to humans. But there's a problem with that.
If all you care about is the exact moment in the lifetime of the
universe when an event happened (or is going to happen), then that
advice is fine. But for humans, the time zone that they expressed a time in can be important, too.
For example, suppose I'm in North Carolina, in the
eastern time zone, but I’m planning an event in Memphis, which is in the central time zone. I go to my calendar
program and carefully enter the date and "3:00 p.m. CST".
The calendar follows the usual convention and converts my entry to UTC
by adding 6 hours, so the time is stored as 9:00 p.m. UTC, or 21:00
UTC. If the calendar uses Django, there's not even any extra code
needed for the conversion, because Django does it automatically.
The next day I look at my calendar to continue working on my
event. The event time has been converted to my local time zone,
or eastern time, so the calendar shows the event happening at "4:00
p.m." (instead of the 3:00 p.m. that it should be). The conversion is not useful for me,
because I want to
plan around other events in the location where the event is
happening, which is using CST, so my local time zone is irrelevant.
The bottom line is that following the advice to always convert
times to UTC results in lost information.
We're sometimes better off storing times with their non-UTC time zones. That's why it's kind of annoying that Django always "converts" local times to UTC before saving
to the database, or even before returning them from a form.
That means the original timezone is lost unless you go to the
trouble of saving it separately and then converting the time from the
database back to that time zone after you get it from the
database. I wrote about this before.
By the way, I've been putting "convert" in scare quotes because talking
about converting times from one time zone to another carries
an implicit assumption that such converting is simple and loses
no information, but as we see, that's not really true.
DAYLIGHT SAVING TIME
Daylight saving time (DST) is even more of
a human invention than time zones.
Time zones are a fairly obvious adaptation to the conflict between how
our bodies prefer to be active during the hours when the sun is up,
and how we communicate time with people in other parts of the world.
Historical changes in time zones across the years are annoying, but since
time zones are a human invention it's not surprising that we'd tweak
them every now and then.
DST, on the other hand, amounts to changing entire time zones twice
every year. What does US/eastern time zone mean? I don't know,
unless you tell me the date. From January 1, 2018 to March 10, 2018, it
meant UTC-5. From March 11, 2018 to November 3, 2018, it meant UTC-4.
And from November 4, 2018 to December 31, 2018, it's UTC-5 again.
The Uniform Time Act of 1966 ruled that daylight saving time
would run from the last Sunday of April until the last Sunday
in October in the United States. The act was amended to make
the first Sunday in April the beginning of daylight saving
time as of 1987. The Energy Policy Act of 2005 extended
daylight saving time in the United States beginning in 2007.
So local times change at 2:00 a.m. EST to 3:00 a.m. EDT on
the second Sunday in March and return at 2:00 a.m. EDT to
1:00 a.m. EST on the first Sunday in November.
So in a little over 50 years, the rules changed 3 times.
Even if you have complete and accurate information about the rules,
daylight saving time complicates things in surprising ways. For
example, you can't convert 2:30 a.m. on March 11, 2018, in the US/eastern
time zone to UTC, because that time never happened — our clocks had to
jump directly from 1:59:59 a.m. to 3:00:00 a.m. See below:
Local date    Local time   Zone   UTC date      UTC time   Zone
2018-03-11    1:59:59      EST    2018-03-11    6:59:59    UTC
2018-03-11    3:00:00      EDT    2018-03-11    7:00:00    UTC
You can't convert 1:30 a.m. November 4, 2018, in US/eastern time
zone to UTC either, because that time happened twice. You would have
to specify whether it was 1:30 a.m. November 4, 2018 EDT or 1:30 a.m.
November 4, 2018 EST:
Local date    Local time   Zone   UTC date      UTC time   Zone
2018-11-04    1:00:00      EDT    2018-11-04    5:00:00    UTC
2018-11-04    1:30:00      EDT    2018-11-04    5:30:00    UTC
2018-11-04    1:59:59      EDT    2018-11-04    5:59:59    UTC
2018-11-04    1:00:00      EST    2018-11-04    6:00:00    UTC
2018-11-04    1:30:00      EST    2018-11-04    6:30:00    UTC
2018-11-04    1:59:59      EST    2018-11-04    6:59:59    UTC
Advice on How to Properly Manage datetimes
Here are some rules I try to follow.
When working in Python, never use naive datetimes. (Those are
datetime objects without timezone information, which unfortunately are
the default in Python, even in Python 3.)
Use the pytz library when
constructing datetimes, and review the documentation
frequently. Properly managing datetimes is not always intuitive, and
using pytz doesn't prevent me from using it incorrectly and
doing things that will provide the wrong results only for some inputs, making it
really hard to spot bugs. I have to triple-check that I'm following the
docs when I write the code and not rely on testing to find problems.
Let me strengthen that even further. It is not possible to
correctly construct datetimes with timezone information using
only Python's own libraries when dealing with timezones that
use DST. I must use pytz or something equivalent.
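Here is a small sketch of what that looks like with pytz, using the dates from the tables above:

from datetime import datetime
import pytz

eastern = pytz.timezone('US/Eastern')

# Wrong: passing a pytz zone as tzinfo uses the zone's historical base offset,
# not the correct EST/EDT offset for this date.
wrong = datetime(2018, 3, 11, 3, 0, tzinfo=eastern)

# Right: let pytz compute the offset (EDT here) for that specific date.
right = eastern.localize(datetime(2018, 3, 11, 3, 0))

# localize() can also flag times that never happened (or happened twice):
try:
    eastern.localize(datetime(2018, 3, 11, 2, 30), is_dst=None)
except pytz.NonExistentTimeError:
    print('2:30 a.m. on March 11, 2018 never happened in US/Eastern')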
If I'm tempted to use datetime.replace, I need to stop, think
hard, and find another way to do it. datetime.replace is almost
always the wrong approach, because changing one part of a datetime without
consideration of the other parts is almost guaranteed to not do what I expect
for some datetimes.
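For example, here is the kind of surprise datetime.replace can produce (a sketch; the zones are arbitrary):

from datetime import datetime
import pytz

eastern = pytz.timezone('US/Eastern')
pacific = pytz.timezone('US/Pacific')

noon_eastern = eastern.localize(datetime(2018, 7, 4, 12, 0))

# replace() swaps the tzinfo label without recomputing anything, so this is
# neither noon Eastern nor the same instant expressed in Pacific time (and
# with pytz it even picks up the zone's base offset rather than PDT).
broken = noon_eastern.replace(tzinfo=pacific)

# The safe way to express the same instant in another zone:
correct = noon_eastern.astimezone(pacific)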
When using Django, be sure USE_TZ = True.
If Django emits warnings about naive datetimes being saved in the
database, treat them as if they were fatal errors, track them down,
and fix them. If I want to, I can even turn them into actual fatal
errors; see this Django documentation.
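A sketch of the relevant settings, including the warning filter the Django documentation suggests for turning those warnings into errors:

# settings.py
USE_TZ = True

import warnings
warnings.filterwarnings(
    'error',
    r'DateTimeField .* received a naive datetime',
    RuntimeWarning,
    r'django\.db\.models\.fields',
)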
When processing user input, consider whether a datetime's original
timezone needs to be preserved, or if it's okay to just store the
datetime as UTC. If the original timezone is important, see this post I wrote about how to get and store it.
Conclusion
Working with human times correctly is complicated and unintuitive, and it takes a lot of careful attention to detail to get right. Further, some of the oft-given advice, like always working in UTC, can cause problems of its own.
New clients regularly ask us if we build WordPress sites. When we dig deeper, we generally learn that they’re looking for a user-friendly content management system (CMS) that will allow them to effortlessly publish and curate their site content. As we’ve written about previously, WordPress can be a good fit for simple sites. However, the majority of our clients need a more robust technical solution with customizable content management tools. For the Python-driven web applications that we develop, we love to work with Wagtail.
What is Wagtail?
Wagtail is a Python-driven CMS built on the Django web framework. It has all the features you’d expect from a quality CMS:
intuitive navigation and architecture
user-friendly content editing tools
painless image uploading and editing capabilities
straightforward and rapid installation
What Makes Wagtail Different?
From the user’s perspective, Wagtail’s content editor is what sets it apart, and it’s why we really love it. Most content management systems use a single WYSIWYG (“what you see is what you get”) HTML editor for page content. While Wagtail includes a WYSIWYG editor — the RichTextField — it also has the Streamfield, which provides an interface that allows you to create and intermix custom content modules, each designed for a specific type of content.
What does that mean in practice? Rather than wrangling an image around text in the WYSIWYG editor and hoping it displays correctly across devices, you can drop an image into a separate, responsive module, which has a custom data model. In other words:
As a user, you don’t need to customize your content to the capabilities of your CMS. Instead, you customize your CMS to maximize your content.
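As a rough sketch of what that looks like on the developer side (the page and block names here are illustrative, and the imports assume Wagtail 2.x):

from wagtail.core.models import Page
from wagtail.core.fields import StreamField
from wagtail.core import blocks
from wagtail.images.blocks import ImageChooserBlock
from wagtail.admin.edit_handlers import StreamFieldPanel

class ArticlePage(Page):
    # Each tuple defines a content module an editor can add, reorder, and mix freely.
    body = StreamField([
        ('heading', blocks.CharBlock()),
        ('paragraph', blocks.RichTextBlock()),
        ('image', ImageChooserBlock()),
        ('quote', blocks.BlockQuoteBlock()),
    ])

    content_panels = Page.content_panels + [
        StreamFieldPanel('body'),
    ]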
Let’s Take a Look
Below is a screenshot of the WordPress editing dashboard, with the single HTML content area. You can edit and format text, add an image, and insert an HTML snippet — all the basics.
Now take a look at Wagtail. Each type of content has its own block — that’s Streamfield at work. The icons at the bottom display the developer-defined modules available, which in this case are Heading block, Paragraph block, Image block, Block quote, and Embed block. This list of modules can be extended to include a variety of custom content areas based on your specific website.
Using the blocks and modules, a web content editor can quickly add a paragraph of text, followed by an image, and then a blockquote to create a beautiful, complete web page. To demonstrate this better, we put together a short video of Streamfields in action.
You can also learn more about Streamfields and other features on the Wagtail website.
Powered by Django
At its core, Wagtail is a Django app, meaning that it seamlessly integrates with other Django and Python applications. This allows near-endless flexibility to extend your project with added functionality. For example, if your application includes complex Python-based data analysis on the backend but you want to easily display output to site visitors, Wagtail is the ideal choice for content management.
The Bottom Line
Wagtail provides content management features that go above and beyond the current abilities of a WordPress site, plus the inherent customization and flexibility of a Django app. We love working with Wagtail because of the clear advantages it provides to our clients and content managers. We highly recommend the Wagtail CMS to all our clients.
Contact us to see if Wagtail would be a good fit for your upcoming project.
Pictured: Our library of reference books at Caktus cover topics including Django and Python, as well as project management and Agile methodologies.
At Caktus, we believe in continued learning (and teaching). It's important to read up on the latest industry trends and technologies to stay current in order to address our clients' challenges. We even maintain a library in our office for staff use, and we add references frequently. Our team enjoys sharing what they've learned by contributing to online resources, such as the Django Documentation and the Mozilla Developer Network Web Docs. Below is a list (in alphabetical order) of the books, blogs, and other documents that we’ve found to be the most accurate, helpful, and practical for Django development.
Overview: When Dmitriy first began learning about Django, he went through the official Django tutorial. Then, as a developer, he read through other pieces of documentation that are relevant to his work.
A Valuable Lesson: Dmitriy learned that detailed documentation makes working with a framework significantly easier than trying to figure it out on his own or from other developers’ posts about their errors. The documentation is readable, uses understandable language, and gives useful examples, making Django Documentation a lot friendlier than Dmitriy expected. It encouraged him to continue using it, since other core developers consider it important to make their software usable and well-documented. One thing that’s particularly helpful about the Django documentation is that pages now have a ‘version switcher’ in the bottom right corner of the screen, allowing readers to switch between the versions of Django for a specific feature. Since our projects at Caktus involve using a number of different versions of Django, it’s helpful to switch between the documentation to see when a feature was added, changed, or deprecated. Seeing the documentation on the Django Documentation site also encouraged Dmitriy to thoroughly document the code he writes for people who will work with it in the future.
Why You Should Read This: The Django tutorial is a great place to begin learning about using Django. The reference guide is best for those who are already using Django and need to look up details on how to use forms, views, URLs, and other parts of the Django API. The topic guides provide high-level explanations.
Overview: The Django User’s Group is a public Google Group that Karen found when she first started using Django in 2006 and ran into some trouble with database tables. She posted her challenges and questions on the Google Group and received a response the same day. She’s been using Django ever since — coincidence?
A Valuable Lesson: Django Users was Karen’s first introduction to the Django community and she learned a great deal from it. It was also her entry into becoming a regular contributing member of the community.
Why You Should Read This: If you have a Django puzzle that you can’t solve, searching the group and (if that fails to yield results) writing up and posting a question is a great way to get a solution. Karen also notes that sometimes it’s not even necessary to post since the act of writing the question in a way others can understand sometimes makes the answer clear! Reading various posts in the group is also a way to see the issues that trip up newcomers, and trying to solve questions by others also provides helpful learning opportunities.
Overview: High Performance Django proclaims to “give you a repeatable blueprint for building and deploying fast, scalable Django sites.” Neil first learned about this book from friend and former coworker Jeff Bradberry, who pointed it out as a way to start pushing his Django development skills beyond a firm grasp of the basics.
A Valuable Lesson: Neil learned that making Django perform at scale means keeping the weight off Django itself. The book taught him about making effective use of the high-performance technologies that make up the rest of the stack to respond to browser requests as early and quickly as possible. It taught him that there’s more to building web apps with Django than just Django, and it opened the door to thinking and learning about many other features of the web app development landscape.
Why You Should Read This: This book is ideal for anyone who’s beginning a career in web app development. It’s especially helpful for those with a different background, whether it’s front-end development or something further afield like computational linguistics. It’s easy to lose sight of the forest for the trees as a new web developer, and this book manages to provide you with a feel for the big picture in a surprisingly small number of pages.
Overview: The Mozilla Developer Network (MDN) Web Docs are a popular resource when it comes to nearly any general web development topic. It’s authored by multiple contributors, and you can be an author, too. Vinod usually visits the site when he’s struggling with a piece of code, and the MDN pops up at the top of his web search results. Caktus especially loves the MDN because we were fortunate to work with Mozilla on the project that powers the MDN.
A Valuable Lesson: Vinod and his team used Vue.js on a recent project, and he learned a lot more about modern Javascript than he needed to know in the past. One specific topic that has confused him was Javascript Promises. Fortunately, the MDN has documentation on using promises and more detailed reference material about the Promise.then(). Those two pieces of documentation cleared up a lot of confusion for Vinod. He also likes how each page of reference documentation includes a browser compatibility section, which helps him to identify whether his code will work in browsers that our clients use.
Why You Should Read This: The MDN provides excellent documentation on every basic front-end technology including HTML, CSS, and Javascript, among others. Since Mozilla is at the forefront of helping to create the specifications for these tools, you can trust that the documentation is authoritative. It’s also constantly being worked on and updated, so you know you’re not getting documentation on a technology that has been deprecated. Finally, and most importantly, the documentation is GOOD! They cover the basic syntax, and always include common usage examples, so that it’s clear how to use the tool. In addition, there are many other gems including tutorials (both basic and advanced) on a wide variety of web development topics.
Towards 14,000 Write Transactions Per Second on my Laptop
A Valuable Lesson: This post not only provides an overview of commit_delay and commit_siblings, but also an important change to the former that dramatically improved its effectiveness since the release of Postgres 9.3. For database servers that need to handle a lot of writes, the commit_delay setting (which is disabled by default, as of Postgres 11) gives you an efficient way to "group" writes to disk that helps increase overall throughput by sacrificing a small amount of latency. The setting has been instrumental to us at Caktus in optimizing Postgres clusters for a couple of client projects, yet Tobias rarely, if ever, sees it mentioned in more general talks and how-tos on optimizing Postgres.
Why You Should Read This: These settings will change nothing for read-heavy sites/apps (such as a CMS), but if you use Postgres in a write-heavy Django (or other) application, you should learn about and potentially configure these settings to improve the product.
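If you want to experiment, both settings live in postgresql.conf; the values below are illustrative starting points, not recommendations:

# postgresql.conf
commit_delay = 1000      # microseconds to wait before flushing WAL; 0 (the default) disables it
commit_siblings = 5      # minimum number of concurrently open transactions before the delay applies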
Two Scoops of Django
Authors: Daniel Roy Greenfeld and Audrey Roy Greenfeld
Overview: Two Scoops of Django has several editions, and the latest is 1.11 (Dan read edition 1.8). The editions stand the test of time, and the authors go through nearly all facets of Django development. They share what has worked best for them and what to watch out for, so you don't have to learn it all the hard way. The authors’ tagline is, “Making Python and Django as fun as ice cream,” and who doesn’t love ice cream?
A Valuable Lesson: By reading the Django Documentation (referenced earlier in this post), you can learn what each setting does. Then in chapter 5 of Two Scoops, read about a battle-tested scheme for managing different settings and files across multiple environments, from local development to testing servers and production, while protecting your secrets (passwords, keys, etc). Similarly, chapter 19 covers what cases you should and shouldn't use the Django admin for, warns about using list_editable in multi-user environments and gives tips for securing the admin and customizing it.
Why You Should Read It: The great thing about the book is that the chapters stand alone. You can pick it up and read whatever chapter you need. Dan keeps the book handy at his desk, for nearly all his Django projects. The book is not only full of useful information, but almost every page also includes examples or diagrams.
Well Read
We recommend these readings on Django development because they provide valuable insight and learning opportunities. What do you refer to when you need a little help with Django? If you have any recommendations or feedback, please leave them in the comments below.
Google Cloud Storage has an officially supported fuse client!
This is something I have always wanted and would have expected for Google
Drive, but 🤷.
The only thing better than a fuse client is a fuse directory that gets
mounted automatically when you log in, which you can do fairly simply
using systemd --user, which is just systemd, but everything runs as you.
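For example, a user unit along these lines should do it (the bucket name, mount point, and gcsfuse path are placeholders):

# ~/.config/systemd/user/gcsfuse-my-bucket.service
[Unit]
Description=Mount my-bucket with gcsfuse
After=network-online.target

[Service]
ExecStartPre=/bin/mkdir -p %h/mnt/my-bucket
ExecStart=/usr/bin/gcsfuse --foreground my-bucket %h/mnt/my-bucket
ExecStopPost=/bin/fusermount -u %h/mnt/my-bucket
Restart=on-failure

[Install]
WantedBy=default.target

Enable it with systemctl --user enable --now gcsfuse-my-bucket.service.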
We have a small two-person Infrastructure Ops team here at Caktus (including myself) so I was excited to go to my first devopsdays Charlotte and be surrounded by Ops people. The event was held just outside of Charlotte, at the Red Ventures auditorium in Indian Land, South Carolina. About 200 people gathered there for two days of talks and open sessions. Devopsdays are held multiple times a year, in various locations around the world. Check out their schedule to see if there will be an event near you.
On Thursday afternoon, Quintessence Anx gave an awesome technical Ignite talk on Sensory Friendly Monitoring. She packed a whole lot of monitoring wisdom into 5 minutes and 20 slides, so I was then looking forward to what she had to say about diversity. She spoke on Unquantified Serendipity: Diversity in Development, and it ended up being my favorite talk.
Quintessence (pictured right) provided a lot of actionable information and answered many common concerns that people have with diversity. She told the story of how, as a junior developer, her mentors often told her how to solve problems, while they told her male peers how to find answers. She suggested that mentors give a “hand up, not a hand out” and stressed the importance of being introduced to a mentor’s network so that the mentee can start building their own. I thought that the talk had the right balance between urgency and applicability.
The Friday Keynote was given by Sonja Gupta and Corey Quinn, and was titled Embarrassingly Large Numbers: Salary Negotiation for Humans. It focused on how to upgrade your income by getting a new job. This talk was informative and entertaining, including more f-bombs than all the other presentations combined. Some of the points they made were:
interview for jobs you don’t plan to take
interview at least once a quarter
never take the first offer
They also recognized that negotiation is hard, but you are not rude if you ask for what you’re worth. I was looking forward to this keynote since I recently began following Corey’s newsletter, Last Week in AWS, where he has elevated snark to an art form.
I enjoyed the event, and I am looking forward to attending devopsdays Raleigh in the fall. The next devopsdays Charlotte will take place in 2020.
What exactly my anus is supposed to be doing for those three months is a question I forget to ask.
And now I know from the questions and comments I’ve received that many of you
are curious too.
The answer turns out to be “nothing much” for most of the time, with
occasional bouts of Phantom Limb sensation.
Now the “phantom limb” in this case is my large intestines, which are no
longer hooked up to my rectum. The sensation they are missing is the pressure
that comes when stool builds up in the colon, which in turn triggers the
sensations of “needing to go”.
So one or two times a day I get the sensation of “needing to go”. I
logically know that’s an impossibility, but apparently my rectum is not swayed
by logic, so I had to look for alternative methods of getting the sensation to
go away. It turns out there is one way to get it to go away and that’s to go
and sit on the toilet as if I were having a bowel movement. And I realize, as I
sit here, on the toilet, not having a bowel movement, that what I’m really
doing is playing “make pretend” for my rectum. And it works!
I wonder if my rectum will be as grateful as my kids are for all the time I
spent playing “make pretend” with them?
At Caktus, we work on many projects, some of which are built by us from start to finish, while others are inherited from other sources. Oftentimes, we pick up a project that we either have not worked on in a long time, or haven’t worked on at all, so we have to get familiar with the code and figure out the design decisions that were made by those who developed it (including when the developers are younger versions of ourselves). Moreover, it is a good idea to improve the setup process in each project, so others can have an easier time getting set up in the future. In our efforts to work on such projects, a few things have been helpful both for becoming familiar with the projects more quickly, and for making the same projects easier to pick up in the future.
Getting Set Up
Here are my top tips for getting up to speed on projects:
Documentation
As a perhaps obvious first step, it can be helpful to read through a README or other documentation, assuming that it exists. At Caktus, we write steps for how to get set up locally with each project, and those steps can be helpful when getting familiar with the major aspects of a project. If there is no README or obvious documentation, we look for comments within files. The first few lines may document the functionality of the rest of the file. One thing that can also be beneficial for future developers is either adding to or creating documentation for getting set up on a project as you get set up yourself. Though you likely don’t have a lot of project knowledge at this point, it can be helpful to write down some documentation, even notes like ‘installing node8.0 here works as of 2019-01-01’.
Project Structure
If you're working on a Django project, for instance, you can look for the files that come with the project: a urls.py file, models.py files, views.py files, and so on, to illuminate the functionality that the project provides and the pages that a user can visit. For non-Django projects, it can still be useful to look at the directories within the project and try to make sense of the different parts of the application. Even in large Django projects with many models.py files, looking at the directories can provide helpful information about what is happening in the project. As an example, we once began working on a project with a few dozen models.py files, each with a number of models in it. Since reading through each models.py file wasn’t a feasible option, it was helpful to see which directory each models.py file was in, so that we could see the general structure of the project.
Tests
In terms of getting familiar with the project, tests (if they exist) are a great place to look, since they provide data for how the different parts of the project are supposed to work. For example, tests for Django views give examples of what data might be sent to the server, and what should happen when such data is sent there. Any test data files (for example, a test XML file for a project that handles XML) can also provide information about what the code should be handling. If we know that a project needs to accept a new XML structure, seeing the old XML structure can save a lot of time when figuring out how the code works.
Improving the Code
Getting familiar with the code should also mean making the code friendlier for future developers. Bear in mind that those future developers may, in fact, be us in a few years, and it’s much friendlier and more efficient to start working on a project that is well-documented and well-tested than on one that is neither. While all the code doesn’t have to be improved all at once, it is possible to start somewhere, even if it means adding comments and tests for a short function. With time, the codebase can be improved and become easier to work with.
Refactoring
Oftentimes when beginning to look at a new (or unfamiliar) project, we get the urge to begin by refactoring code to be more efficient, more readable, or just more modern. However, it has been more helpful to resist this urge at the beginning until we understand the project better, since working code is better than non-working code. Also, there are often good reasons why things were written a certain way, and changing them may have consequences that we are not aware of, especially if there aren’t sufficient tests in the project. Instead, it may be helpful to add comments to the code as we figure out how things work, and to focus on tests, leaving refactoring for a future time.
Testing
Testing is a great place to start improving the codebase. If tests already exist, then working on a feature or bugfix should include improving the tests. If tests don’t exist, then working on a feature or bugfix should include adding relevant tests. Having tests in place will also make any refactoring work easier to do, since they can be used to check what, if anything, broke during the refactoring.
Documentation
As mentioned above, documentation makes starting to work on a project much easier. Moreover, working through getting the project set up is a great time to either add or improve a README file or other setup documentation. As you continue working on the code, you can continue to make documentation improvements along the way.
Conclusion
Having made these recommendations, I should also acknowledge that we are often faced with various constraints (time, resources, or scope) when working on client projects. While these constraints are real, best practices should still be followed, and changes can be made while working on the code, such as adding tests for new functionality, or improving comments, documentation, and tests for features that already exist and are being changed. Doing so will help any future developers to understand the project and get up to speed on it more efficiently, and this ultimately saves clients time and money.
Pictured: Developer Dan Poirier is an advocate for WCPE and a volunteer announcer. WCPE is one of the recipients of our charitable giving program.
We are pleased to continue serving the North Carolina community at-large through our semi-annual Charitable Giving Program. Twice a year we solicit proposals from our team to contribute to a variety of non-profit organizations. With this program, we look to support groups in which Cakti are involved or that have impacted their lives in some way. This gives Caktus a chance to support our own employees as well as the wider community. For winter 2018, we were pleased to donate to the following organizations:
ARTS North Carolina
ARTS North Carolina “calls for equity and access to the arts for all North Carolinians, unifies and connects North Carolina’s arts communities, and fosters arts leadership.” Our Account Executive Tim Scales has been a board member and supporter of this organization for several years.
Museum of Life and Science
The Museum of Life and Science’s mission is to “create a place of lifelong learning where people, from young child to senior citizen, embrace science as a way of knowing about themselves, their community, and their world.” Our Chief Business Development Officer Ian Huckabee is a current museum board member and sits on the executive and finance committees.
Sisters’ Voices
Sisters’ Voices is a “choral community of girls, within which each is known and supported while being challenged to grow as a musician and as a person.” Caktus Developer Vinod Kurup’s niece Vishali and daughter Anika (pictured from left to right) are members of the Sisters' Voices choir. Vinod believes that being a member of the choir has “enriched their lives and taught them the love of music, and they developed an appreciation of their own voice.”
WCPE
WCPE is a non-commercial, independent, listener-supported radio station, dedicated to excellence in classical music. Broadcasting includes service to the Piedmont area, Raleigh, Durham, and Chapel Hill on 89.7 FM. Their facility is staffed 24 hours a day, 7 days a week.
“WCPE gained the distinction of being the only public radio station in the eastern half of North Carolina to stay on the air during Hurricane Fran in 1996, acting as an Emergency Broadcast Relay station, providing weather information directly from the National Weather Service.” Caktus Developer Dan Poirier is an advocate for WCPE and has been listening and donating for years. Last year, he trained as a volunteer announcer and now commutes 90 miles round-trip, 2-3 times a month to work a shift on the air.
Caktus’ next round of giving will be June 2019, and we look forward to supporting another group of organizations that are committed to enriching the lives of North Carolinians!
The last time we had a 70% top marginal tax rate in the U.S.
it generated very little revenue. That doesn’t mean it failed,
that means it was doing its job, as explained in
The opportunity cost of firm payouts.
As part of our work to make sharp web apps at Caktus, we frequently create API endpoints that allow other software to interact with a server. Oftentimes this means using a frontend app (React, Vue, or Angular), though it could also mean connecting some other piece of software to interact with a server. A lot of our API endpoints, across projects, end up functioning in similar ways, so we have become efficient at writing them, and this blog post gives an example of how to do so.
A typical request for an API endpoint may be something like: 'the front end app needs to be able to read, create, and update companies through the API'. Here is a summary of creating a model, a serializer, and a view for such a scenario, including tests for each part:
Part 1: Model
For this example, we’ll assume that a Company model doesn’t currently exist in Django, so we will create one with some basic fields:
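The model definition itself isn't reproduced here, but based on the fields referenced by the serializer and tests below, a minimal sketch might look something like this (field lengths and options are assumptions):

# models.py
from django.db import models


class Company(models.Model):
    name = models.CharField(max_length=255)
    description = models.TextField(blank=True)
    website = models.URLField(blank=True)
    street_line_1 = models.CharField(max_length=255)
    street_line_2 = models.CharField(max_length=255, blank=True)
    city = models.CharField(max_length=100)
    state = models.CharField(max_length=2)
    zipcode = models.CharField(max_length=10)

    def __str__(self):
        return self.name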
Writing tests is important for making sure our app works well, so we add one for the __str__() method. Note: we use the factory-boy and Faker libraries for creating test data:
# tests/test_models.py
from django.test import TestCase

from ..models import Company
from .factories import CompanyFactory


class CompanyTestCase(TestCase):
    def test_str(self):
        """Test for string representation."""
        company = CompanyFactory()
        self.assertEqual(str(company), company.name)
With a model created, we can move on to creating a serializer for handling the data going in and out of our app for the Company model.
Part 2: Serializer
Django Rest Framework uses serializers to handle converting data between JSON or XML and native Python objects. There are a number of helpful serializers we can import that will make serializing our objects easier. The most common one we use is a ModelSerializer, which conveniently can be used to serialize data for Company objects:
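A minimal version, assuming the Company fields sketched above, could be as short as:

# serializers.py
from rest_framework import serializers

from .models import Company


class CompanySerializer(serializers.ModelSerializer):
    class Meta:
        model = Company
        fields = (
            'id', 'name', 'description', 'website', 'street_line_1',
            'street_line_2', 'city', 'state', 'zipcode',
        )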
That is all that’s required for defining a serializer, though a lot more customization can be added, such as:
outputting fields that don’t exist on the model (maybe something like is_new_company, or other data that can be calculated on the backend)
custom validation logic for when data is sent to the endpoint for any of the fields
custom logic for creates (POST requests) or updates (PUT or PATCH requests)
It’s also beneficial to add a simple test for our serializer, making sure that the values for each of the fields in the serializer match the values for each of the fields on the model:
# tests/test_serializers.py
from django.test import TestCase

from ..serializers import CompanySerializer
from .factories import CompanyFactory


class CompanySerializerTestCase(TestCase):
    def test_model_fields(self):
        """Serializer data matches the Company object for each field."""
        company = CompanyFactory()
        serializer = CompanySerializer(instance=company)
        for field_name in [
            'id', 'name', 'description', 'website', 'street_line_1',
            'street_line_2', 'city', 'state', 'zipcode'
        ]:
            self.assertEqual(
                serializer.data[field_name],
                getattr(company, field_name)
            )
Part 3: View
The view is the layer in which we hook up a URL to a queryset, and a serializer for each object in the queryset. Django Rest Framework again provides helpful objects that we can use to define our view. Since we want to create an API endpoint for reading, creating, and updating Company objects, we can use Django Rest Framework mixins for such actions. Django Rest Framework does provide a ModelViewSet which by default allows handling of POST, PUT, PATCH, and DELETE requests, but since we don’t need to handle DELETE requests, we can use the relevant mixins for each of the actions we need:
# views.py
from rest_framework.mixins import (
    CreateModelMixin, ListModelMixin, RetrieveModelMixin, UpdateModelMixin
)
from rest_framework.viewsets import GenericViewSet

from .models import Company
from .serializers import CompanySerializer


class CompanyViewSet(GenericViewSet,       # generic view functionality
                     CreateModelMixin,     # handles POSTs
                     RetrieveModelMixin,   # handles GETs for 1 Company
                     UpdateModelMixin,     # handles PUTs and PATCHes
                     ListModelMixin):      # handles GETs for many Companies

    serializer_class = CompanySerializer
    queryset = Company.objects.all()
Now we have an API endpoint that allows making GET, POST, PUT, and PATCH requests to read, create, and update Company objects. In order to make sure it works just as we expect, we add some tests:
# tests/test_views.py
import json

from django.test import TestCase
from django.urls import reverse

from rest_framework import status

from ..models import Company
from .factories import CompanyFactory, UserFactory


class CompanyViewSetTestCase(TestCase):
    def setUp(self):
        self.user = UserFactory(email='testuser@example.com')
        self.user.set_password('testpassword')
        self.user.save()
        self.client.login(email=self.user.email, password='testpassword')
        self.list_url = reverse('company-list')

    def get_detail_url(self, company_id):
        return reverse('company-detail', kwargs={'pk': company_id})

    def test_get_list(self):
        """GET the list page of Companies."""
        companies = [CompanyFactory() for i in range(0, 3)]
        response = self.client.get(self.list_url)
        self.assertEqual(response.status_code, status.HTTP_200_OK)
        self.assertEqual(
            set(company['id'] for company in response.data['results']),
            set(company.id for company in companies)
        )

    def test_get_detail(self):
        """GET a detail page for a Company."""
        company = CompanyFactory()
        response = self.client.get(self.get_detail_url(company.id))
        self.assertEqual(response.status_code, status.HTTP_200_OK)
        self.assertEqual(response.data['name'], company.name)

    def test_post(self):
        """POST to create a Company."""
        data = {
            'name': 'New name',
            'description': 'New description',
            'street_line_1': 'New street_line_1',
            'city': 'New City',
            'state': 'NY',
            'zipcode': '12345',
        }
        self.assertEqual(Company.objects.count(), 0)
        response = self.client.post(self.list_url, data=data)
        self.assertEqual(response.status_code, status.HTTP_201_CREATED)
        self.assertEqual(Company.objects.count(), 1)
        company = Company.objects.all().first()
        for field_name in data.keys():
            self.assertEqual(getattr(company, field_name), data[field_name])

    def test_put(self):
        """PUT to update a Company."""
        company = CompanyFactory()
        data = {
            'name': 'New name',
            'description': 'New description',
            'street_line_1': 'New street_line_1',
            'city': 'New City',
            'state': 'NY',
            'zipcode': '12345',
        }
        # Django's test client doesn't form-encode PUT data, so send JSON.
        response = self.client.put(
            self.get_detail_url(company.id),
            data=json.dumps(data),
            content_type='application/json',
        )
        self.assertEqual(response.status_code, status.HTTP_200_OK)
        # The object has really been updated
        company.refresh_from_db()
        for field_name in data.keys():
            self.assertEqual(getattr(company, field_name), data[field_name])

    def test_patch(self):
        """PATCH to update a Company."""
        company = CompanyFactory()
        data = {'name': 'New name'}
        response = self.client.patch(
            self.get_detail_url(company.id),
            data=json.dumps(data),
            content_type='application/json',
        )
        self.assertEqual(response.status_code, status.HTTP_200_OK)
        # The object has really been updated
        company.refresh_from_db()
        self.assertEqual(company.name, data['name'])

    def test_delete(self):
        """DELETEing is not implemented."""
        company = CompanyFactory()
        response = self.client.delete(self.get_detail_url(company.id))
        self.assertEqual(response.status_code, status.HTTP_405_METHOD_NOT_ALLOWED)
As the app becomes more complicated, we add more functionality (and more tests) to handle things like permissions and required fields. For a quick way to limit permissions to authenticated users, we add the following to our settings file:
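One common way to do that with Django Rest Framework is to set a default permission class; a sketch of the relevant setting:

# settings.py
REST_FRAMEWORK = {
    'DEFAULT_PERMISSION_CLASSES': (
        'rest_framework.permissions.IsAuthenticated',
    ),
}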
And add a test that only permissioned users can access the endpoint:
# tests/test_views.py
from django.test import TestCase
from django.urls import reverse

from rest_framework import status

from ..models import Company
from .factories import CompanyFactory, UserFactory


class CompanyViewSetTestCase(TestCase):
    ...

    def test_unauthenticated(self):
        """Unauthenticated users may not use the API."""
        self.client.logout()
        company = CompanyFactory()

        with self.subTest('GET list page'):
            response = self.client.get(self.list_url)
            self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)

        with self.subTest('GET detail page'):
            response = self.client.get(self.get_detail_url(company.id))
            self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)

        with self.subTest('PUT'):
            data = {
                'name': 'New name',
                'description': 'New description',
                'street_line_1': 'New street_line_1',
                'city': 'New City',
                'state': 'NY',
                'zipcode': '12345',
            }
            response = self.client.put(self.get_detail_url(company.id), data=data)
            self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
            # The company was not updated
            company.refresh_from_db()
            self.assertNotEqual(company.name, data['name'])

        with self.subTest('PATCH'):
            data = {'name': 'New name'}
            response = self.client.patch(self.get_detail_url(company.id), data=data)
            self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
            # The company was not updated
            company.refresh_from_db()
            self.assertNotEqual(company.name, data['name'])

        with self.subTest('POST'):
            data = {
                'name': 'New name',
                'description': 'New description',
                'street_line_1': 'New street_line_1',
                'city': 'New City',
                'state': 'NY',
                'zipcode': '12345',
            }
            response = self.client.post(self.list_url, data=data)
            self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)

        with self.subTest('DELETE'):
            response = self.client.delete(self.get_detail_url(company.id))
            self.assertEqual(response.status_code, status.HTTP_403_FORBIDDEN)
            # The company was not deleted
            self.assertTrue(Company.objects.filter(id=company.id).exists())
As our project grows, we can edit these permissions, make them more specific, and continue to add more complexity, but for now, these are reasonable defaults to start with.
Conclusion
Adding an API endpoint to a project can take a considerable amount of time, but with the Django Rest Framework tools, it can be done more quickly, and be well-tested. Django Rest Framework provides helpful tools that we’ve used at Caktus to create many endpoints, so our process has become a lot more efficient, while still maintaining good coding practices. Therefore, we’ve been able to focus our efforts in other places in order to expand our abilities to grow sharp web apps.
Above: The Internet Summit in Raleigh is one of the local conferences we recommend attending. (Photo by Ian Huckabee.)
At Caktus, we strongly believe in professional development and continued learning. We encourage our talented team to stay up to date with industry trends and technologies. During 2018, Cakti attended a number of conferences around the country. Below is a list (in alphabetical order) of the ones we found the most helpful, practical, and interesting. We look forward to attending these conferences again, and if you get the chance, we highly recommend that you check them out as well.
All Things Open is a celebration and exploration of open source technology and its impact. Topics range from nuts and bolts sessions on database design and OSS languages to higher-level explorations of current trends like machine learning with Python and practical blockchain applications.
The annual conference is heavily publicized in the open source community, of which Caktus is an active member. All Things Open attracts open source thought leaders from across industries, and it’s a valuable learning experience for both non-technical newbies and expert developers to gain insight into a constantly evolving field.
Tim particularly enjoyed the session by expert Jordan Kasper of the Defense Digital Service about his efforts to implement open source development at the US Department of Defense with the code.mil project. It was an enlightening look at the use of open source in the federal government, including the challenges and opportunities involved.
Above: Hundreds of attendees at DjangoCon 2018. (Photo by Bartek Pawlik.)
DjangoCon is the preeminent conference for Django users, and as early proponents of Django, Caktus has sponsored the conference since 2009. We always enjoy the annual celebration of all things Django, featuring presentations and workshops from veteran Djangonauts and enthusiastic beginners.
David enjoyed Russell Keith-Magee’s informative and hilarious talk on navigating the complexities of time zones in computer programming. It’s rare for a conference talk to explore the infinite complexity of a topic while also providing concrete tools for resolving it, as Russell’s did. Attending DjangoCon is a must for Django newbies and veterans alike. It’s an opportunity to sharpen your craft and explore the possibilities of our favorite framework in a fun, collaborative, and supportive environment.
The Digital PM Summit is an annual gathering of project managers, which includes presentations, workshops, and breakout sessions focused on managing digital projects. The event is organized by the Bureau of Digital and was held in Memphis in 2018. Cakti attended and spoke at the conference in previous years; check out the highlights of Marketing Content Manager Elizabeth Michalka’s talk on investing in relationships from the 2016 conference.
The summit provides a unique opportunity for attendees to network and learn from each other. Indeed, the biggest draw for Gannon was the chance to be in the same room as so many other project managers. The project management (PM) role encompasses an array of activities — planning and defining scope, activity planning and sequencing, resource planning, time and cost estimating, and reporting progress, to name a few. There are professional books and months-long certificate programs to teach this knowledge, but nothing is better than being able to ask, “Hey, have you run into this before?” The ability to compare notes with PMs he doesn’t work with every day is invaluable, and the three most impactful sessions for Gannon were:
Rachel Gertz’s talk “Static to Signal: A Lesson in Alignment”
Meghan McInerney’s talk on “The Ride or Die PM”
Lynn Winter’s talk “PM Burnout: The Struggle Is Real”
Each of these talks seemed to build on one another. Rachel Gertz set the stage with her keynote by pointing out that the project manager is the nexus of a project. The course of a project is determined by countless small adjustments, and the PM is the one who makes those adjustments. Nine times out of 10, when a project fails, it’s because of something the PM did or didn’t do.
Also a keynote, Meghan McInerney’s talk (pictured) identified the primary attributes of a PM who’s at the top of their game. They’re reliable, adaptable, and a strategic partner for their clients. When you hit this ideal, you’re the one asking stakeholders and team members hard questions about “Should we?” — and as the PM, you’re the only one who can be counted on to ask those questions. Lynn Winter’s lightning talk cautions against giving too much of yourself over to this, though. As she pointed out, the role is often made up of all the tasks that no one else wants to do, and there’s a good chance at least a few of those tasks will take a toll on you. You have to make space for yourself if you’re going to be effective.
Internet Summit
Recommended by Chief Business Development Officer Ian Huckabee
Next Conference Location: Raleigh
Next Conference Date: November 13 - 14, 2019
The Internet Summit is a marketing conference that attracts impressive speakers and provides opportunities to stay on top of digital marketing trends and technologies that can help drive growth and success. Internet Summit conferences are organized by Digital Summit and are also held in various cities, in addition to Raleigh.
Digital marketing is constantly changing, so it’s important to stay current. At Internet Summit, Ian heard valuable information from dozens of the country’s top digital marketers who shared current trends and best practices for using real-time data to build intelligence and improve customer interaction and engagement. Keynote speakers included marketing guru, author, and former dot-com business exec Seth Godin and founder of The Onion Scott Dikkers.
These summits are key to staying on trend with how people want to be reached, and the execution strategies are, in most cases, proven. For instance, behavioral targeting tools have evolved to the point where ABM (account-based marketing) can be extremely effective when executed properly. Also meaningful for Ian was the talk Marketing Analytics: Get the Insights You Need Faster, by Matt Hertig. Ian walked away from the workshop with tangible advice on managing large volumes of data to provide meaningful insights and analysis more quickly.
JupyterDay in the Triangle
Recommended by Chief Technical Officer and Co-founder Colin Copeland
Next Conference Location: Chapel Hill
Next Conference Date: TBD
In 2018, Colin was excited to attend JupyterDay in the Triangle because it provided a chance to learn from the greater Python/Jupyter community, and it took place just around the corner in Chapel Hill. He especially enjoyed the following presentations:
Matthew McCormick’s talk on Interactive 3D and 2D Image Visualization for Jupyter, which demonstrated how notebooks can be used to explore very large datasets and scientific imaging data
Joan Pharr’s talk on Learning in Jupyter focused on how her company uses Jupyter notebooks for SME Training and onboarding new members to their team
Colin’s favorite presentation was Have Yourself a Merry Little Notebook by Ginny Gezzo of IBM (pictured). She focused on using Python notebooks to solve https://adventofcode.com/ puzzles. The talk highlighted the value of notebooks as a tool for exploring and experimenting with problems that you don’t know how to solve. Researching solutions can easily lead in many directions, and it’s valuable to have a medium to record what you did and how you got there, and you can easily share your results with other team members.
Colin has worked with Jupyter notebooks and sees great value in their utility. For example, Caktus used them on a project with Open Data Policing to inspect merging multiple sheets of an Excel workbook in Pandas (see the project on GitHub).
Above: The Caktus team and booth during PyCon 2018.
Cakti regularly attend and sponsor PyCon. It’s the largest annual gathering of Python users and developers. The next conference will take place in Cleveland, OH, in May 2019.
The event also includes an “unconference” that runs in parallel with the scheduled talks. Scott especially enjoyed these open sessions during the 2018 conference. PyCon dedicates space to open sessions every year. These are informal, community-driven meetings where anybody can post a topic, with a time and place to gather. The open sessions that Scott attended covered a wide range of topics, from using Python to control CNC milling machines to how reporters can use encryption to protect sources. He also enjoyed a session on Site Reliability Engineering (SRE), which included professionals from Intel, Google, and Facebook who spoke about how they managed infrastructure at scale.
TestBash was so good in 2017, that Gerald decided to attend again in 2018. The conferences are organized nationally and internationally by the Ministry of Testing, and in 2018, Gerald attended the event in San Francisco. Gerald originally learned about the event on a QA Testing forum called The Club. Read about what he loved at the 2017 conference.
TestBash is a single-track conference providing a variety of talks that cover all areas of testing. Talks range from topics such as manual testing, automation, and machine learning, to less technical topics including work culture and quality.
One particularly interesting talk was given by Paul Grizzaffi, a Principal Automation Architect at Magenic. He declared that automation development is like software development. Paul talked about how the same principles used when developing features for a website can also be used when building out the automation that tests said website. Just as code is written to build a website, code is also written to create the automated scripts that test it; therefore, there is a valid argument for treating automation as software development. The talk highlighted the point that sometimes automation is seen as an extra tool, but it’s actually something that we build to perform a task. So when we think about it, it’s not that different from the development process one would go through when developing a new website. Paul’s talk is available on The Dojo (with a free membership), and you can read more on his blog.
TestBash provides practical information that attendees can learn and take back to their teams to implement. Attendees not only learn from the speakers but also from each other by sharing their challenges and how they overcame them. It’s also a positive environment for networking and building friendships. Gerald met people at the conference who he’s stayed in touch with and who provide a lasting sounding board.
Worth Going Again
We recommend these conferences and look forward to attending them again because they provide such valuable learning and networking opportunities.
What conferences do you plan to attend this year? If you have any recommendations, please leave them in the comments below.
Pictured from left: Our musically inclined Cakti, Dane Summers, Dan Poirier, and Ian Huckabee.
The first installment of the secret lives of Cakti highlighted some colorful extracurriculars (including rescuing cats, running endurance events, and saving lives). This time, we’re taking a look at our team’s unexpected musical talents.
If you Google musicians and programming, you’ll find dozens of posts exploring the correlation between musical talent and programming expertise. Possible factors include musicians’ natural attention to detail, their trained balance between analysis and creativity, and their comfort with both solitary focus and close collaboration.
Cakti are no exception to this, and creative talent runs deep across our team. Here are a few of our musical colleagues.
Appalachian Picker Dane Summers
Contract programmer Dane is inspired by old-time Appalachian music as both a banjo player and flat foot clogger. After ten years of learning to play, he’s managed to accumulate four banjos, but his favorite (and the only currently-functional one) is a fretless that he plays in the traditional Round Peak style. He’s working up to singing while he plays, at which point we hope he'll do an in-office concert.
The Multi-Talented Dan Poirier
Our sharp developer Dan has multiple musical passions. As a singer, he lends his baritone to Voices, a distinguished community chorus in Chapel Hill. You can also hear him a couple of times a month on WCPE, the Classical Station, as an announcer for Weekend Classics. Rumor has it that he’s also a dab hand at the ukulele, though until he shows off his talents at the office we won’t know for sure.
Blues Guitarist Ian Huckabee
Holding the distinction as the only Caktus team member to jam with Harry Connick, Jr., our chief business development officer Ian has played blues guitar since he was 12 years old. He also started his professional career in the music business, managing the NYC recording studios for Sony Music Entertainment. His current musical challenge is mastering Stevie Ray Vaughan’s cover of Little Wing.
Waiting for the Band to Get Together
No word yet on whether Dan, Dane, and Ian are planning to start a Caktus band, but we’ll keep you posted. If they do, they’ll have more talent to draw from: our team also includes an opera singer, multiple guitarists, a fiddle player, and others.
It's been a while since we last discussed bulk inserts on the Caktus blog. The idea is simple: if you have an application that needs to insert a lot of data into a Django model — for example a background task that processes a CSV file (or some other text file) — it pays to "chunk" those updates to the database so that multiple records are created through a single database operation. This reduces the total number of round-trips to the database, something my colleague Dan Poirier discussed in more detail in the post linked above.
Today, we use Django's Model.objects.bulk_create() regularly to help speed up operations that insert a lot of data into a database. One of our projects involves processing a spreadsheet with multiple tabs, each of which might contain thousands or even tens of thousands of records, some of which might correspond to multiple model classes. We also need to validate the data in the spreadsheet and return errors to the user as quickly as possible, so structuring the process efficiently helps to improve the overall user experience.
While it's great to have support for bulk inserts directly in Django's ORM, the ORM does not provide much assistance in terms of managing the bulk insertion process itself. One common pattern we found ourselves using for bulk insertions was to:
build up a list of objects
when the list got to a certain size, call bulk_create()
make sure any objects remaining (i.e., which might be fewer than the chunk size of prior calls to bulk_create()) are inserted as well
Since for this particular project we needed to repeat the same logic for a number of different models in a number of different places, it made sense to abstract that into a single class to handle all of our bulk insertions. The API we were looking for was relatively straightforward:
Set bulk_mgr = BulkCreateManager(chunk_size=100) to create an instance of our bulk insertion helper with a specific chunk size (the number of objects that should be inserted in a single query)
Call bulk_mgr.add(unsaved_model_object) for each model instance we needed to insert. The underlying logic should determine if/when a "chunk" of objects should be created and does so, without the need for the surrounding code to know what's happening. Additionally, it should handle objects from any model class transparently, without the need for the calling code to maintain separate object lists for each model.
Call bulk_mgr.done() after adding all the model objects, to insert any objects that may have been queued for insertion but not yet inserted.
Without further ado, here's a copy of the helper class we came up with for this particular project:
from collections import defaultdict

from django.apps import apps


class BulkCreateManager(object):
    """
    This helper class keeps track of ORM objects to be created for multiple
    model classes, and automatically creates those objects with `bulk_create`
    when the number of objects accumulated for a given model class exceeds
    `chunk_size`.
    Upon completion of the loop that's `add()`ing objects, the developer must
    call `done()` to ensure the final set of objects is created for all models.
    """

    def __init__(self, chunk_size=100):
        self._create_queues = defaultdict(list)
        self.chunk_size = chunk_size

    def _commit(self, model_class):
        model_key = model_class._meta.label
        model_class.objects.bulk_create(self._create_queues[model_key])
        self._create_queues[model_key] = []

    def add(self, obj):
        """
        Add an object to the queue to be created, and call bulk_create if we
        have enough objs.
        """
        model_class = type(obj)
        model_key = model_class._meta.label
        self._create_queues[model_key].append(obj)
        if len(self._create_queues[model_key]) >= self.chunk_size:
            self._commit(model_class)

    def done(self):
        """
        Always call this upon completion to make sure the final partial chunk
        is saved.
        """
        for model_name, objs in self._create_queues.items():
            if len(objs) > 0:
                self._commit(apps.get_model(model_name))
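Here's a usage sketch; the CSV file, its columns, and the MyModel class are hypothetical stand-ins:

import csv

from myapp.models import MyModel  # hypothetical model

bulk_mgr = BulkCreateManager(chunk_size=100)
with open('data.csv') as csv_file:
    for row in csv.DictReader(csv_file):
        # Queue an unsaved instance; a bulk_create fires every 100 objects.
        bulk_mgr.add(MyModel(first_field=row['first_field'],
                             second_field=row['second_field']))
# Insert whatever is left over in the final partial chunk.
bulk_mgr.done()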
I tried to simplify the code here as much as possible for the purposes of this example, but you can obviously expand this as needed to handle multiple model classes and more complex business logic. You could also potentially put bulk_mgr.done() in its own finally: or except ExceptionType: block; however, you should be careful not to write to the database again if the original exception is database-related.
Another useful pattern might be to design this as a context manager in Python. We haven't tried that yet, but you might want to.
Good luck with speeding up your Django model inserts, and feel free to post below with any questions or comments!
In 2018, we published 44 posts on our blog, including technical how-to’s, a series on UX research methods, web development best practices, and tips for project management. Among all those posts, 18 rose to the top of the popularity list in 2018.
Make ALL Your Django Forms Better: This post also focuses on Django forms. Learn how to efficiently build consistent forms, across an entire website.
Django vs WordPress: How to decide?: Once you invest in a content management platform, the cost to switch later may be high. Learn about the differences between Django and WordPress, and see which one best fits your needs.
Basics of Django Rest Framework: Django Rest Framework is a library which helps you build flexible APIs for your project. Learn how to use it, with this intro post.
How to Fix your Python Code's Style: When you inherit code that doesn’t follow your style preferences, fix it quickly with the instructions in this post.
Filtering and Pagination with Django: Learn to build a list page that allows filtering and pagination by enhancing Django with tools like django_filter.
Better Python Dependency Management with pip-tools: One of our developers looked into using pip-tools to improve his workflow around projects' Python dependencies. See what he learned with pip-tools version 2.0.2.
Types of UX Research: User-centered research is an important part of design and development. In this first post in the UX research series, we dive into the different types of research and when to use each one.
Avoiding the Blame Game in Scrum: The words we use, and the tone in which we use them, can either nurture or hinder the growth of Scrum teams. Learn about the importance of communicating without placing blame.
What is Software Quality Assurance?: A crucial but often overlooked aspect of software development is quality assurance. Find out more about its value and why it should be part of your development process.
Quick Tips: How to Find Your Project ID in JIRA Cloud: Have you ever created a filter in JIRA full of project names and returned to edit it, only to find all the project names replaced by five-digit numbers with no context? Learn how to find the project in both the old and new JIRA experience.
UX Research Methods 3: Evaluating What Is: One set of techniques included in UX research involves evaluating the landscape and specific instances of existing user experience. Learn more about competitive landscape review.
5 Scrum Master Lessons Learned: Whether your team is new to Scrum or not, check out these lessons learned. Some are practical, some are abstract, and some are helpful reminders like “Stop being resistant to change, let yourself be flexible."
Add Value To Your Django Project With An API: This post for business users and beginning coders outlines what an API is and how it can add value to your web development project.
Caktus Blog: Best of 2017: How appropriate that the last post in this list is about our most popular posts from the previous year! So, when you’ve read the posts above, check out our best posts from 2017.
Thank You for Reading Our Blog
We look forward to giving you more content in 2019, and we welcome any questions, suggestions, or feedback. Simply leave a comment below.
You may look at my job title (or picture) and think, “Oh, this is easy, he’s going to resolve to stand up at his desk more.” Well, you’re not wrong, that is one of my resolutions, but I have an even more important
one. I, Jeremy Gibson, resolve to do less work in 2019. You’re probably thinking that it’s bold to admit this on my employer’s blog. Again, you’re not wrong, but I think I can convince them that the less work I
do, the more clear and functional my code will become. My resolution has three components.
1) I will stop using os.path to do path manipulations and will only
use pathlib.Path on any project that uses Python 3.4+
I acknowledge that pathlib is better than me at keeping operating system eccentricities in mind. It is also better at keeping my code DRYer
and more readable. I will not fight that.
Let's take a look at an example that is very close to parity. First, a
simple case using os.path.
# Opening a file with os.path
import os

pn = 'my_file.txt'
if not os.path.exists(pn):
    open(pn, 'a')
with open(pn) as fh:
    ...  # Manipulate
Next, pathlib.Path
# Opening a file with Path
from pathlib import Path

p = Path("my_file.txt")
if not p.exists():
    p.touch()
with p.open() as fh:
    ...  # Manipulate
This seems like a minor improvement, if any at all, but hear me out. The
pathlib version is more internally consistent. Pathlib sticks to its own
idiom, whereas os.path must step outside of itself to accomplish path
related tasks like file creation. While this might seem minor, not
having to code switch to accomplish a task can be a big help for new
developers and veterans, too.
Not convinced by the previous example? Here’s a more complex example of
path work that you might typically run across during development —
validating a set of files in one location and then moving them to
another location, while making the code workable over different
operating systems.
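The original side-by-side code isn't reproduced here, but a sketch of the pathlib version might look like this (the directory names and the "validation" check are placeholders for illustration):

from pathlib import Path

incoming = Path('data') / 'incoming'
processed = Path('data') / 'processed'
processed.mkdir(parents=True, exist_ok=True)

for source in incoming.glob('*.txt'):
    # A stand-in validation check on the file's contents.
    if source.read_text().strip():
        # The / operator builds the destination path for any OS.
        source.rename(processed / source.name)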
Note: with pathlib I don't have to worry about os.sep. Less work!
More readable!
Also, as in the first example, all path manipulation and control is now
contained within the library, so there is no need to pull in outside
os functions or shutil modules. To me, this is more satisfying. When
working with paths, it makes sense to work with one type of object that
understands itself as a path, rather than different collections of
functions nested in other modules.
Ultimately, for me, this is a more human way to think about the
processes that I am manipulating. Thus making it easier and less work.
Yaay!
2) I will start using
f'' strings
on Python 3.6+ projects.
I acknowledge that adding .format() is a waste of precious line
characters (I'm looking at you, PEP 8) and % notation is unreadable.
f'' strings make my code more elegant and easier to read. They also
move closer to the other idioms used by Python, like r'' and b'' and
the no longer necessary (if you are on Python 3) u''. Yes, this is a
small thing, but less work is the goal.
for k, v in somedict.items():
    print('The key is {}\n The value is {}'.format(k, v))
vs.
for k, v in somedict.items():
    print(f'The key is {k}\n The value is {v}')
Another advantage in readability and maintainability is that I don't
have to keep track of parameter position as before with .format(k, v)
if I later decide that I really want v before k.
3) I will work toward, as much as possible, writing my tests before I
write my code.
I acknowledge that I am bad about jumping into a problem, trying to
solve it before I fully understand the behavior I want to see (don't
judge me, I know some of you do this, too). I hope, foolishly, that the
behavior will reveal itself as I solve the various problems that crop
up.
Writing your tests first may seem unintuitive, but hear me out. This is
known as Test Driven
Development.
Rediscovered by Kent Beck in 2003, it is a programming methodology that
seeks to tackle the problem of managing code complexity with our puny
human brains.
The basic concept is simple: to understand how to build your program you
must understand how it will fail. So, the first thing that you should do
is write tests for the behaviors of your program. These tests will fail
and that is good because now you (the programmer with the puny human
brain) have a map for your code. As you make each test pass, you will
quickly know if the code doesn’t play well with other parts of the code,
causing the other tests to fail.
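To make that concrete, here's a tiny sketch of the loop, using a hypothetical make_slug helper that exists only for illustration: write the failing test first, then write just enough code to make it pass.

import unittest


class MakeSlugTestCase(unittest.TestCase):
    def test_spaces_become_dashes(self):
        # Written before make_slug exists, so the first run fails.
        self.assertEqual(make_slug('Hello World'), 'hello-world')


# Only after seeing the failure do we write the simplest passing code.
def make_slug(text):
    return text.strip().lower().replace(' ', '-')


if __name__ == '__main__':
    unittest.main()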
This idea is closely related to Acceptance Test Driven
Development, which you may have also heard of, and is mentioned in this Caktus post.
It All Adds Up
Although these three parts of my resolution are not huge, together they
will allow me to work less. Initially, as I write the code, and then in
the future when I come back to code I wrote two sprints ago that is now a
mystery to me.
So that's it, I'm working less next year, and that will make my code
better.
Sometimes we inherit code that doesn't follow the style guidelines
we prefer when we're writing new code. We could just run
flake8 on
the whole codebase and fix everything before we continue, but that's not
necessarily the best use of our time.
Another approach is to update the styling of files when we need to make
other changes to them. To do that, it's helpful to be able to run a code style
checker on just the files we're changing. I've written tools to do that for
various source control systems and languages over the years. Here's the one I'm
currently using for Python and flake8.
I call this script flake. I have a key in my IDE bound to run it and show
the output so I can click on each line to go to the code that
has the problem, which makes it pretty easy to fix things.
It can run in two modes. By default, it checks any files that have uncommitted
changes. Or I can pass it the name of a git branch, and it checks all files
that have changes compared to that branch. That works well when I'm working
on a feature branch that is several commits downstream from develop and I
want to be sure all the files I've changed while working on the feature are
now styled properly.
The script is in Python, of course.
Work from the repository root
Since we're going to work with file paths output from git commands, it's
simplest if we first make sure we're in the root directory of the repository.
#!/usr/bin/env python3
import os
import os.path
import subprocess

if not os.path.isdir('.git'):
    print("Working dir: %s" % os.getcwd())
    result = subprocess.run(
        ['git', 'rev-parse', '--show-toplevel'],
        stdout=subprocess.PIPE
    )
    dir = result.stdout.rstrip(b'\n')
    os.chdir(dir)
    print("Changed to %s" % dir)
We use git rev-parse --show-toplevel to find out what the top directory in
the repository working tree is, then change to it. But first we check for
a .git directory, which tells us we don't need to change directories.
Find files changed from a branch
If a branch name is passed on the command line, we want to identify the Python
files that have changed compared to that branch.
import sys

...

if len(sys.argv) > 1:
    # Run against files that are different from *branch_name*
    branch_name = sys.argv[1]
    cmd = ["git", "diff", "--name-status", branch_name, "--"]
    out = subprocess.check_output(cmd).decode('utf-8')
    changed = [
        # "M\tfilename"
        line[2:]
        for line in out.splitlines()
        if line.endswith(".py") and "migrations" not in line and line[0] != 'D'
    ]
We use git diff --name-status <branch-name> -- to list the changes compared
to the branch. We skip file deletions — that means we no longer have a file to
check — and migrations, which never seem to quite be PEP-8 compliant and which
I've decided aren't worth trying to fix. (You may decide differently, of
course.)
Find files with uncommitted changes
Alternatively, we just look at the files that have uncommitted changes.
else:
    # See what files have uncommitted changes
    cmd = ["git", "status", "--porcelain", "--untracked=no"]
    out = subprocess.check_output(cmd).decode('utf-8')
    changed = []
    for line in out.splitlines():
        if "migrations" in line:
            # Auto-generated migrations are rarely PEP-8 compliant. It's a
            # losing battle to always fix them.
            continue
        if line.endswith('.py'):
            if '->' in line:
                # A file was renamed. Consider the new name changed.
                parts = line.split(' -> ')
                changed.append(parts[1])
            elif line[0] == 'M' or line[1] != ' ':
                changed.append(line[3:])
Here we take advantage of git --porcelain to ensure the output won't
change from one git version to the next, and it's fairly easy to parse in
a script. (Maybe I should investigate using --porcelain with the other
git commands in the script, but what I have now works well enough.)
Run flake8 on the changed files
Either way, changed now has a list of the files we want to run flake8 on.
Running flake8 with subprocess.call this way sends the output to stdout
so we can see it. flake8 will exit with a non-zero status if there are problems;
we print a message and also exit with a non-zero status.
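The final lines of the script aren't reproduced here, but the idea is roughly:

if changed:
    # Send flake8's findings to stdout so they're clickable in the IDE.
    rc = subprocess.call(['flake8'] + changed)
    if rc:
        print("Flake8 found style problems.")
        sys.exit(rc)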
Wrapping up
I might have once written a script like this in Shell or Perl, but
Python turns out to work quite well once you get a handle on the
subprocess module.
The resulting script is useful for me. I hope you'll find parts of it
useful too, or at least see something you can steal for your own scripts.
‘Tis the season for shopping online, sending cute holiday e-cards, and emailing photos to grandparents. But during all this festive online activity, how much do you think about your computer security? For example, is your password different for every shopping and e-card site that you use? If not, it should be!
Given that Friday, November 30, is Computer Security Day, it’s a good time to consider whether your online holiday habits are putting you at risk of a data breach. And our top tip is to use a different password for every website and online account. You’ve probably heard this a hundred times already, but it’s the first line of defense that you have against attacks.
We all should take computer and internet security seriously. The biggest threat to ordinary users is password reuse, like having the same (or similar) username and password combination for Amazon, Facebook, and your health insurance website. This issue is frighteningly common — the resource Have I Been Pwned has collected 5.6 billion username and password pairs since 2013. Once attackers breach one of your online accounts, they then try the same username and password on sites across the internet, looking for another match.
If one password on one website is breached, then all your other accounts with the same password are vulnerable.
It’s worth reiterating: Don’t use the same password on more than one website. Otherwise, your accounts are an easy target for an attacker to gain valuable data like your credit card number and go on a holiday shopping spree that’ll give you a headache worse than any eggnog hangover you’ve ever had!
More Tips to Fend Off an Online Grinch
Here are a few more tips for password security, to help protect your personal information from attacks, scams, phishing, and other unsavory Grinch-like activity:
Create a strong password for every website and online account. A password manager like LastPass or 1Password can help you create unique passwords for every online account. Be sure to also choose a strong passphrase with 2-factor authentication for your password manager login, and then set it up to automatically generate passwords for you.
Choose 2-factor authentication. Many websites now offer some means of 2-factor authentication. It takes a few more minutes to set up, but it’s worth it. Do this on as many websites as possible to make your logins more secure.
Do not send personal or business-related passwords via email. It may be an easy means of communication, but email is not a secure method of communication.
Have Holly, Jolly Holidays
You have an online footprint consisting of various accounts, email providers, social media, and web browsing history. Essential personal info, like your health records, banking and credit records are online, too. All of this info is valuable and sellable to someone, and the tools they use to steal your data are cheap. All they need to do is get one credit card number and the payoff may be huge. Don’t let that credit card number be yours, otherwise, you won’t have a very jolly holiday.
Be vigilant, especially around the holidays, when there’s an increase in online commerce and communication, and therefore a greater chance that an attacker may succeed in getting the info they want from you.
DjangoCon 2018 attracted attendees from around the world, including myself and several other Cakti (check out our DjangoCon recap post). Having attended a number of DjangoCons in the past, I looked forward to reconnecting with old colleagues and friends within the community, learning new things about our favorite framework, and exploring San Diego.
While it was a privilege to attend DjangoCon in person, you can experience it remotely. Thanks to technology and the motivated organizers, you can view a lot of the talks online. For that, I am thankful to the DjangoCon organizers, sponsors, and staff that put in the time and energy to ensure that these talks are readily available for viewing on YouTube.
Learn How to Give Back to the Django Framework
While I listened to a lot of fascinating talks, there was one that stood out and was the most impactful to me. I also think it is relevant and important for the whole Django community. If you have not seen it, I encourage you to watch and rewatch Carlton Gibson’s “Your web framework needs you!". Carlton was named a Django Fellow in January of 2018 and provides a unique perspective on the state of Django as an open source software project, from the day-to-day management, to the (lack of) diversity amongst the primary contributors, to the ways that people can contribute at the code and documentation levels.
This talk resonated with me because I have worked with open source software my entire career. It has enabled me to bootstrap and build elegant solutions with minimal resources. Django and its ilk have afforded me opportunities to travel the globe and engage with amazing people. However, in over 15 years of experience, my contributions back to the software and communities that have served me well have been nominal in comparison to the benefits I have received. But I came away from the talk highly motivated to contribute more, and am eager to get that ball rolling.
Carlton says in his talk, “we have an opportunity to build the future of Django here.” He’s right, our web framework needs us, and via his talk you will discover how to get involved in the process, as well as what improvements are being made to simplify onboarding. I agree with Carlton, and believe it’s imperative to widen the net of contributors by creating multiple avenues for contributions that are easily accessible and well supported. Contributions are key to ensuring a sound future for the Django framework. Whether it’s improving documentation, increasing test coverage, fixing bugs, building new features, or some other aspect that piques your interest, be sure to do your part for your framework. The time that I am able to put toward contributing to open source software has always supplied an exponential return, so give it a try yourself!
Watch the talk to see how you can contribute to the Django framework.
Above: Hundreds of happy Djangonauts at DjangoCon 2018. (Photo by Bartek Pawlik.)
That’s it, folks — another DjangoCon in the books! Caktus was thrilled to sponsor and attend this fantastic gathering of Djangonauts for the ninth year running. This year’s conference ran from October 14 - 19, in sunny San Diego. ☀️
Our talented Caktus contractor Erin Mullaney was a core member of this year’s DjangoCon organizing team, plus five more Cakti joined as participants: CTO Colin Copeland, technical manager Karen Tracey, sales engineer David Ray, CBDO Ian Huckabee, and myself, account exec Tim Scales.
What a Crowd!
At Caktus we love coding with Django, but what makes Django particularly special is the remarkable community behind it. From the inclusive code of conduct to the friendly smiles in the hallways, DjangoCon is a welcoming event and a great opportunity to meet and learn from amazing people. With over 300 Django experts and enthusiasts attending from all over the world, we loved catching up with old friends and making new ones.
#Djangocon 2018 in San Diego has been the most inclusive, breath of fresh air conference I've ever attended, with the most beautiful and diverse group of people. Way to go Team Djangocon, @FlipperPA !!! <3
DjangoCon is three full days of impressive and inspiring sessions from a diverse lineup of presenters. Between the five Cakti there, we managed to attend almost every one of the presentations.
We particularly enjoyed Anna Makarudze’s keynote address about her journey with coding, Russell Keith-Magee’s hilarious talk about tackling time zone complexity, and Tom Dyson’s interactive presentation about Django and Machine Learning. (Videos of the talks should be posted soon by DjangoCon.)
Thanks to the 30+ Djangonauts who joined us for the Caktus Mini Golf Outing on Tuesday, October 16! Seven teams putted their way through the challenging course at Belmont Park, talking Django and showing off their mini golf skills. We had fun meeting new friends and playing a round during the beautiful San Diego evening.
Mini golf was a blast! An ending worthy of the Masters: a 2-way tie until the very end when — drum roll — @GrahamDumpleton, last player on the course (pictured with caddies) won by a single stroke! Congrats! And thanks to everyone who came out. #DjangoCon pic.twitter.com/S79MoAr9zt
If you want to build a list page that allows filtering and pagination, you have to get a few separate things to work together. Django provides some tools for pagination, but the documentation doesn't tell us how to make that work with anything else. Similarly, django_filter makes it relatively easy to add filters to a view, but doesn't tell you how to add pagination (or other things) without breaking the filtering.
The heart of the problem is that both features use query parameters, and we need to find a way to let each feature control its own query parameters without breaking the other one.
Filters
Let's start with a review of filtering, with an example of how you might subclass ListView to add filtering. To make it filter the way you want, you need to create a subclass of FilterSet and set filterset_class on the view to that class. (See the django_filter documentation for how to write a filterset.)
from django.views.generic import ListView


class FilteredListView(ListView):
    filterset_class = None

    def get_queryset(self):
        # Get the queryset however you usually would. For example:
        queryset = super().get_queryset()
        # Then use the query parameters and the queryset to
        # instantiate a filterset and save it as an attribute
        # on the view instance for later.
        self.filterset = self.filterset_class(self.request.GET, queryset=queryset)
        # Return the filtered queryset
        return self.filterset.qs.distinct()

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        # Pass the filterset to the template - it provides the form.
        context['filterset'] = self.filterset
        return context
Here's an example of how you might create a concrete view to use it:
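As a minimal sketch (the Book model, the BookFilterSet, and the template name are hypothetical placeholders; substitute your own model and filters), a concrete view might look something like this:

from django_filters import FilterSet

from .models import Book  # hypothetical model


class BookFilterSet(FilterSet):
    class Meta:
        model = Book
        # Filter on whichever model fields make sense for your list page.
        fields = ['type', 'author']


class BookListView(FilteredListView):
    # FilteredListView is the base class defined above.
    model = Book
    filterset_class = BookFilterSet
    template_name = 'books/book_list.html'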
filterset.form is a form that controls the filtering, so we just render that however we want and add a way to submit it.
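For instance, a template along these lines would render the filter form and the results (the file path and markup here are only illustrative):

{# books/book_list.html #}
<form method="get">
  {{ filterset.form.as_p }}
  <input type="submit" value="Filter">
</form>

<ul>
  {% for object in object_list %}
    <li>{{ object }}</li>
  {% endfor %}
</ul>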
That's all you need to make a simple filtered view.
Default values for filters
I'm going to digress slightly here, and show a way to give filters default values, so when a user loads a page initially, for example, the items will be sorted with the most recent first. I couldn't find anything about this in the django_filter documentation, and it took me a while to figure out a good solution.
To do this, I override __init__ on my filter set and add default values to the data being passed:
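Here's a sketch of what that override can look like, assuming a hypothetical ordering filter whose default should be most-recent-first; the filter name and the '-created' value are placeholders for whatever your filterset actually defines:

from django_filters import FilterSet, OrderingFilter

from .models import Book  # hypothetical model


class BookFilterSet(FilterSet):
    # Lets the user sort by creation date, ascending or descending.
    ordering = OrderingFilter(fields=['created'])

    class Meta:
        model = Book
        fields = ['type']

    def __init__(self, data=None, *args, **kwargs):
        if data is not None:
            # request.GET is immutable, so work on a copy.
            data = data.copy()
            # If the user hasn't picked an ordering, fall back to a default
            # so the initial page load shows the most recent items first.
            if not data.get('ordering'):
                data['ordering'] = '-created'
        super().__init__(data, *args, **kwargs)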
Pagination
Django's ListView has pagination support built in; you enable it by setting paginate_by on the view. Once paginate_by is set to the number of items you want per page, object_list will contain only the items on the current page, and there will be some additional items in the context:
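These are the standard pagination variables that ListView adds to the context (see Django's pagination documentation for details):
paginator: the Paginator instance managing the full list of items
page_obj: the current page of results, with attributes like number, has_next, and has_previous
is_paginated: True if the results span more than one page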
We need to update the template so the user can control the pages.
Let's start our template updates by just telling the user where we are:
{% if is_paginated %}Page {{ page_obj.number }} of {{ paginator.num_pages }}{% endif %}
To tell the view which page to display, we want to add a query parameter named page whose value is a page number. In the simplest case, we can just make a link with ?page=N, e.g.:
<a href="?page=2">Go to page 2</a>
You can use the page_obj and paginator objects to build a full set of pagination links, but there's a problem we should solve first.
Combining filtering and pagination
Unfortunately, linking to pages as described above breaks filtering. More specifically, whenever you follow one of those links, the view will forget whatever filtering the user has applied, because that filtering is also controlled by query parameters, and these links don't include the filter's parameters. So if you're on a page https://example.com/objectlist/?type=paperback and then follow a page link, you'll end up at https://example.com/objectlist/?page=3 when you wanted to be at https://example.com/objectlist/?type=paperback&page=3.
It would be nice if Django helped out with a way to build links that set one query parameter without losing the existing ones. I found a nice example of a template tag on Stack Overflow and modified it slightly into this custom template tag that does just that:
# <app>/templatetags/my_tags.py
from django import template

register = template.Library()


@register.simple_tag(takes_context=True)
def param_replace(context, **kwargs):
    """
    Return encoded URL parameters that are the same as the current
    request's parameters, only with the specified GET parameters added or changed.

    It also removes any empty parameters to keep things neat,
    so you can remove a param by setting it to ``""``.

    For example, if you're on the page ``/things/?with_frosting=true&page=5``,
    then

    <a href="/things/?{% param_replace page=3 %}">Page 3</a>

    would expand to

    <a href="/things/?with_frosting=true&page=3">Page 3</a>

    Based on
    https://stackoverflow.com/questions/22734695/next-and-before-links-for-a-django-paginated-query/22735278#22735278
    """
    d = context['request'].GET.copy()
    for k, v in kwargs.items():
        d[k] = v
    for k in [k for k, v in d.items() if not v]:
        del d[k]
    return d.urlencode()
Here's how you can use that template tag to build pagination links that preserve other query parameters used for things like filtering:
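For example, assuming the tag file above is saved as my_tags.py, previous/next links might look like this (the same pattern works for numbered page links):

{% load my_tags %}

{% if is_paginated %}
  {% if page_obj.has_previous %}
    <a href="?{% param_replace page=page_obj.previous_page_number %}">Previous</a>
  {% endif %}

  Page {{ page_obj.number }} of {{ paginator.num_pages }}

  {% if page_obj.has_next %}
    <a href="?{% param_replace page=page_obj.next_page_number %}">Next</a>
  {% endif %}
{% endif %}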
Now, if you're on a page like https://example.com/objectlist/?type=paperback&page=3, the links will look like ?type=paperback&page=2, ?type=paperback&page=4, etc.
Pictured from left: Caktus team members Vinod Kurup, Karen Tracey, and David Ray.
The Caktus team includes expert developers, sharp project managers, and eagle-eyed QA analysts. However, you may not know that there’s more to them than meets the eye. Here’s a peek at how Cakti spend their off-hours.
Vinod Kurup, M.D.
By day Vinod is a mild-mannered developer, but at night he swaps his keyboard for a stethoscope and heads to the hospital. Vinod’s first career was in medicine, and prior to Caktus he worked many years as an MD. While he’s now turned his expertise to programming, he still works part-time as a hospitalist. Now that’s what I call a side hustle.
Karen Tracey, Cat Rescuer
When Karen isn’t busy as both lead developer and technical manager for Caktus, she works extensively with Alley Cats and Angels, a local cat rescue organization dedicated to improving the lives and reducing the population of homeless cats in the Triangle area. She regularly fosters cats and kittens, which is why you sometimes find feline friends hanging out in the Caktus office.
David Ray, Extreme Athlete
Software development and extreme physical endurance training don’t generally go together, but let me introduce you to developer/sales engineer David. When not building solutions for Caktus clients, David straps on a 50-pound pack and completes 24-hour rucking events. Needless to say, he’s one tough Caktus. (Would you believe he’s also a trained opera singer?)
Pictured: David Ray at a recent rucking event.
These are just a few of our illustrious colleagues! Our team also boasts folk musicians, theater artists, sailboat captains, Appalachian cloggers, martial artists, and more.
If you’re building or updating a website, you’re probably wondering about which content management system (CMS) to use. A CMS helps users — particularly non-technical users — to add pages and blog posts, embed videos and images, and incorporate other content into their site.
CMS options
You could go with something quick and do-it-yourself, like WordPress (read more about WordPress) or a drag-and-drop builder like Squarespace. If you need greater functionality, like user account management or asset tracking, or if you’re concerned about security and extensibility, you’ll need a more robust CMS. That means using a framework to build a complex website that can manage large volumes of data and content.
Wait, what’s a framework?
Put simply, a framework is a library of reusable code that a web developer can build on and adapt to produce custom products more quickly than coding everything from scratch.
Django and Drupal are both frameworks with dedicated functionality for content management, but there is a key difference between them:
Drupal combines aspects of a web application framework with aspects of a CMS
Django separates the framework and the CMS
The separation that Django provides makes it easier for content managers to use the CMS because they don’t have to tinker with the technical aspects of the framework. A popular combination is Django and Wagtail, which is our favorite CMS.
I think I’ve heard of Drupal ...
Drupal is open source and built with the PHP programming language. For some applications, its customizable templates and quick integrations make it a solid choice. It’s commonly used in higher education settings, among others.
However, Drupal’s predefined templates and plugins can also be its weakness: while they are useful for building a basic site, they are limiting if you want to scale the application. You’ll quickly run into challenges attempting to extend the basic functionality, including adding custom integrations and nonstandard data models.
Other criticisms include:
Poor backwards compatibility, particularly for versions earlier than Drupal 7. Updating a Drupal site requires developers to rewrite code for elements of the templates and modules to make them compatible with the newest version. Staying up to date is important for security reasons, so putting updates off for too long becomes a problem.
Unit testing is difficult because Drupal stores configuration in the database, which makes it hard to test the effects of changes to individual sections of the code. Without proper testing, errors are more likely to make it into the final version of the website.
Another database-related challenge lies in how the site configuration is managed. If you’re trying to implement changes on a large website consisting of thousands of individual content items or users, none of the things that usually make this easier — like the ability to view line-by-line site configuration changes during code review — are possible.
What does the above mean for non-technical stakeholders? Development processes are slowed down significantly because developers have to pass massive database files back and forth with low visibility into the changes made by other team members. It also means there is an increased likelihood that errors will reach the public version of your website, creating even more work to fix them.
Caktus prefers Django
Django is used by complex, high-profile websites, including Instagram, Pinterest, and Eventbrite. It’s written in Python, a powerful, open-source programming language, and Django itself was created specifically to speed up the process of web development. It’s fast, secure, scalable, and intended for use with database-driven web applications.
A huge benefit of Django is greater control over customization, and existing data can be converted and migrated more easily. Since it’s built on Python, Django uses a paradigm called object-oriented programming, which makes it easier to manage and manipulate data, troubleshoot errors, and re-use code. It’s also easier for developers to see where changes have been made in the code, simplifying the process of updating the application after it goes live.
How to choose the right tool
Consider the following factors when choosing between Drupal and Django:
Need for customization
Internal capacity
Planning for future updates
Need for customization:
If your organization has specific, niche features or functionality that require custom development — for example, data types specific to a library, university, or scientific application — Django is the way to go. It requires more up-front development than template-driven Drupal but allows greater flexibility and customization. Drupal is a good choice if you’re happy to use templates to build your website and don’t need customization.
Internal capacity:
Drupal’s steep learning curve means that it may take some time for content managers to get up to speed. In comparison, we’ve run training workshops that get content management teams up and running on Django-based Wagtail in only a day or two. Wagtail’s intuitive user interface makes it easier to manage regular content updates, and the level of customization afforded by Django means the user interface can be developed in a way that feels intuitive to users.
Planning for future updates:
Future growth and development should be taken into account when planning a web project. The choices made during the initial project phase will impact the time, expense, and difficulty of future development. As mentioned, Drupal has backwards compatibility challenges, and therefore a web project envisioned as fast-paced and open to frequent updates will benefit from a custom Django solution.
Need a second opinion?
Don’t just take our word for it. Here’s what Brad Busenius at the University of Chicago says about their Django solution:
"[It impacts] the speed and ease at which we can create highly custom interfaces, page types, etc. Instead of trying to bend a general system like Drupal to fit our specific needs, we're able to easily build exactly what we want without any additional overhead. Also, since we're often understaffed, the fact that it's a developer-friendly system helps us a lot. Wagtail has been a very positive experience so far."
The bottom line
Deciding between Django and Drupal comes down to your specific needs and goals, and it’s worth considering the options. That said, based on our 10+ years of experience developing custom websites and web applications, we almost always recommend Django with Wagtail because it’s:
Easier to update and maintain
More straightforward for content managers to learn and use
More efficient with large data sets and complex queries
Less likely to let errors slip through the cracks
If you want to consider Django and whether it will suit your next project, we’d be happy to talk it through and share some advice. Get in touch with us.
Above: Caktus Account Manager Tim Scales gears up for DjangoCon.
We’re looking forward to taking part in the international gathering of Django enthusiasts at DjangoCon 2018, in San Diego, CA. We’ll be there from October 14 - 19, and we’re proud to attend as sponsors for the ninth year! As such, we’re hosting a mini golf event for attendees (details below).
This year’s speakers are impressive, thanks in part to Erin Mullaney, one of Caktus’ talented developers, who volunteered with DjangoCon’s Program Team. The three-person team, including Developer Jessica Deaton of Wilmington, NC, and Tim Allen, IT Director at The Wharton School, reviewed 257 speaker submissions. They ultimately chose the speakers with the help of a rating system that included community input.
“It was a lot of fun reading the submissions,” said Erin, who will also attend DjangoCon. “I’m really looking forward to seeing the talks this year, especially because I now have a better understanding of how much work goes into the selection process.”
Erin and the program team also created the talk schedule. The roster of speakers includes more women and underrepresented communities due to the DjangoCon diversity initiatives, which Erin is proud to support.
What we’re excited about
Erin said she’s excited about a new State of Django panel that will take place on Wednesday, October 17, which will cap off the conference portion of DjangoCon, before the sprints begin. It should be an informative wrap-up session.
Karen Tracey, our Lead Developer and Technical Manager, is looking forward to hearing “Herding Cats with Django: Technical and social tools to incentivize participation” by Sage Sharp. This talk seems relevant to the continued vibrancy of Django's own development, said Karen, since the core framework and various standard packages are developed with limited funding and rely tremendously on volunteer participation.
Our Account Manager Tim Scales is particularly excited about Tom Dyson’s talk, “Here Come The Robots,” which will explore how people are leveraging Django for machine learning solutions. This is an emerging area of interest for our clients, and one of particular interest to Caktus as we grow our areas of expertise.
Follow us on Twitter @CaktusGroup and #DjangoCon to stay tuned on the talks.
Golf anyone?
If you’re attending DjangoCon, come play a round of mini golf with us. Look for our insert in your conference tote bag. It includes a free pass to a mini golf outing that we’re hosting at Tiki Town Adventure Golf on Tuesday, October 16, at 7:00 p.m. (please RSVP online). The first round of golf is on us! Whoever shoots the lowest score will win a $100 Amazon gift card.*
No worries if you’re not into mini golf! Instead, find a time to chat with us one-on-one during DjangoCon.
*In the event of a tie, the winner will be selected from a random drawing from the names of those with the lowest score. Caktus employees can play, but are not eligible for prizes.