Transformers 3: Building and training a Transformer
Printworks, London

Having discussed how attention works and the structure of Transformers, we'll now implement a simple Transformer that translates German text into English. To do this we'll take the attention mechanism and other network components that Andrej Karpathy developed in nanoGPT for language generation, and reuse them to build a Transformer for language translation using the PyTorch framework. The translation Transformer's structure follows the example in François Chollet's book 'Deep Learning with Python', which is written in Keras/TensorFlow....
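As a taste of the kind of component we'll be reusing, here is a minimal sketch of single-head scaled dot-product self-attention in PyTorch. This is an illustrative simplification, not the actual nanoGPT code: nanoGPT's block is multi-headed, causally masked, and fused into one projection, whereas the class name `SelfAttention` and the separate query/key/value projections below are choices made here for clarity.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    """Single-head scaled dot-product self-attention.

    Illustrative sketch only; nanoGPT's real attention block is
    multi-headed and causally masked.
    """
    def __init__(self, embed_dim):
        super().__init__()
        # Separate linear projections for queries, keys, and values
        self.query = nn.Linear(embed_dim, embed_dim)
        self.key = nn.Linear(embed_dim, embed_dim)
        self.value = nn.Linear(embed_dim, embed_dim)

    def forward(self, x):
        # x has shape (batch, seq_len, embed_dim)
        q, k, v = self.query(x), self.key(x), self.value(x)
        # Attention scores, scaled by sqrt of the key dimension
        scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))
        # Each row of weights sums to 1 over the sequence positions
        weights = F.softmax(scores, dim=-1)
        # Weighted sum of values: same shape as the input
        return weights @ v

x = torch.randn(2, 5, 16)   # batch of 2 sequences, 5 tokens, 16-dim embeddings
attn = SelfAttention(16)
out = attn(x)
print(out.shape)            # torch.Size([2, 5, 16])
```

The output has the same shape as the input, which is what lets attention layers be stacked and mixed freely with the other Transformer components we'll build in the rest of this post.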