Pinned ·

What's the time, Azure Synapse?

Azure Synapse is Microsoft's fancy data analytics platform designed to handle all kinds of data analytics tasks. Unfortunately, there are significant deficiencies in how this platform handles datetime data and timezones, making it incredibly difficult to use Synapse in conjunctio…

A Workflow for Data Science

Jupyter notebooks are a phenomenal tool for investigating and interacting with data. The interactive nature of the notebooks allows for quick and effective interrogation of the data and more generally the prototyping of an idea. By combining interaction, the documentation, the co…

Updating Query with Soft Delete for SQLAlchemy 1.4

When working on a database, there are times when we want to delete a row, but still want to keep the record that the row referred to. There can be many reasons for this; however, the basic idea is that we want the row to generally be hid…

Handling Exceptions in Numerical Python

The numerical computing capabilities of the Python ecosystem are incredibly powerful, with the NumPy, SciPy, and pandas packages providing high-performance, well-tested tools. Additionally, tools like Dask and RAPIDS build upon these foundational packages, supporting larger dat…

Failing Hard: How I 'lost' two years of data

Failure is one of those topics that is discussed far less than it should be (it is hard to tell other people about mistakes you have made), yet failure is usually a far better teacher than success. This is why I want to share my story of how I spent the first two years of my P…

A Complete Guide to Docker on Fedora

While there are numerous guides to installing Docker on Fedora, none of the guides leave the installation in a state that I would consider usable. This is intended to be a single complete guide for the setup and configuration of Docker, highlighting the differences that are requi…

Speed Through Specificity

A common criticism of the Python programming language is that it is slow, often with reference to a benchmark comparing a range of tasks. This criticism is widely addressed, with articles by Jake VanderPlas and Anthony Shaw being two excellent examples. While I don't disagre…

Compiling LaTeX on Travis-CI

One of the best parts of the current software development environment is the proliferation of Continuous Integration (CI) services like Travis-CI. These CI services plug into GitHub or other code repositories to automatically run when new code is pushed to a repository. Typically…

Experi: A tool for computational experiments

One of the key features of computational experiments is being able to run the experiment over a large variable space. However, in my experience there aren't tools available to assist with this, particularly in the realm of High Performance Computing (HPC), where bash arr…

Distributing a Hoomd Plugin

A piece of software I have been using in my research is Hoomd, a 'relatively' new package for running Molecular Dynamics (MD) simulations. These MD simulations have the basic premise of throwing hundreds of balls into a box and shaking it to find out what happens. The relative ne…