1 Why do I need a VM?

If you do not have access to a high-performance computing cluster, a virtual machine (VM) is a good alternative.

Many bioinformatics processes are computationally intensive and are not viable on a standard PC or laptop. A VM is essentially a remote computer, usually more powerful than your own, that you connect to from your local machine. Because the work runs on the VM, you don’t need to keep your local machine switched on, and, as long as the job is started in a way that survives disconnection, you don’t need to stay connected either. For time-consuming bioinformatics this is ideal.

There are many providers online, e.g. DigitalOcean, Microsoft Azure, Oracle, and Amazon Web Services. These offer free tiers or trial credit, which might be enough for your needs, but the free options tend to be limited in processing power and storage space. For more computationally intensive processes it is worth researching how much memory and storage you will need, and then selecting a paid VM to match.

I have been using DigitalOcean, which I like because of its user-friendly website for managing the VM. As an example, suppose I am processing metabarcoding sequencing files and need a minimum of 16 GB of memory and 200 GB of storage space.

2 DigitalOcean VM setup

  1. Set up an account at https://www.digitalocean.com/ and register a card as the payment method.
  2. Create a virtual instance or “droplet” and select the following parameters:
     1. Select a datacentre - choose a geographically close location to avoid latency issues
     2. Select the VM image with the best compatibility for your desired process e.g. “Ubuntu v22.04 (LTS) x64”
     3. Select a basic droplet package with 16 GB RAM - or a more powerful VM if needed
     4. Select automatic mounting
     5. Select the file system “ext4”
     6. Set up a password for connecting to the droplet
     7. Choose a name e.g. “ubuntu-vm”
  3. Once created, additional storage space can be mounted by navigating to the droplet and selecting “mount volume” - e.g. add 200 GB storage

3 Connecting to the VM through Windows PowerShell

Open Windows PowerShell by pressing Windows key + X and then selecting PowerShell. On macOS and Linux, open the terminal.

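Connect using the following command, replacing ipforyourvm with the IP address of your droplet (shown on the droplet’s page in your DigitalOcean account):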

ssh root@ipforyourvm

Enter the password set during VM creation.
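
Once logged in, it can be worth checking that the VM really has the resources you selected. The following is a minimal sketch, assuming a Linux image such as the Ubuntu one above; the helper name `kb_to_gib` is my own and not part of any tool. Expect the reported memory to be slightly below the nominal droplet size, since the kernel reserves some for itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a size in kB (the unit used in /proc/meminfo)
# to GiB, printed with one decimal place.
kb_to_gib() {
    awk -v kb="$1" 'BEGIN { printf "%.1f", kb / 1024 / 1024 }'
}

# Total RAM as seen by the kernel (Linux only).
mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
echo "Memory: $(kb_to_gib "$mem_kb") GiB"

# Number of CPU cores available to this VM.
echo "CPU cores: $(nproc)"

# Size and free space of every mounted filesystem.
# An attached volume should appear as its own line in this table.
df -h
```

If an attached volume does not appear in the `df -h` output, it has probably not been mounted yet.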

4 Miniconda

Many bioinformatics tools are installed into conda environments. Miniconda provides conda and can be installed as follows:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Enter ‘yes’ to both yes/no questions (accepting the licence terms and letting the installer initialise conda).

For the changes to take effect, you will have to shut down and restart the Linux system or, on a VM, log out and then reconnect.
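
After reconnecting, a quick check like the one below can confirm that conda is available. This is only a sketch: `have_cmd` is a hypothetical helper defined here, and the fallback message is my own wording rather than anything conda prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the named command can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd conda; then
    conda --version   # print the installed conda version
    conda env list    # list the environments conda knows about
else
    echo "conda not found on PATH: log out and reconnect, or re-run the installer"
fi
```

From there, an environment can be created with `conda create -n metabarcoding` and entered with `conda activate metabarcoding`, where the environment name is just an example.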

5 Final thoughts

You have now set up a VM and can connect to it. From here you can begin installing the packages you need to run whatever bioinformatics processes you like!
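
One thing to keep in mind for long jobs: a command started directly in an SSH session is normally stopped when that session closes. A common way around this is `nohup`, sketched below with a short stand-in command in place of a real pipeline. The log file name `job.log` is an arbitrary choice for this example.

```shell
#!/usr/bin/env bash
# Arbitrary log file name for this example; choose any path you like.
log_file="job.log"

# Stand-in for a long-running pipeline command (here it only sleeps briefly).
# nohup makes the job ignore the hangup signal sent when the SSH session closes,
# "&" runs it in the background, and all output goes to the log file.
nohup sh -c 'sleep 1; echo "job finished"' > "$log_file" 2>&1 &
job_pid=$!
echo "Started job with PID $job_pid"

# In real use you could log out at this point and read the log later.
# For this demonstration we wait for the job and then show its output.
wait "$job_pid"
cat "$log_file"
```

Terminal multiplexers such as `tmux` or `screen` are alternatives that also let you reattach to the running session later.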

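When you are not using the VM, check how your provider bills for idle machines so that you can minimise the costs. On DigitalOcean, a droplet that is only powered off is still billed because its resources stay reserved; to stop the charges you need to destroy the droplet, taking a snapshot first if you want to restore your setup later.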