In the early months of the COVID-19 pandemic I wrote a C++ tool for comparing genomes. The project is named “vdiff”, short for “virus diff”, after the Unix command-line tool “diff”.

The vdiff command-line tool reports information about two genomes, including the number of matching strings found in both genomes and the percentage of the genomes that match when aligned. The results are logged to the command line, and the matches can optionally be written to a CSV file for graphing.
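To make those two numbers concrete, here is a minimal sketch of how such metrics could be computed. It is not taken from the vdiff source: the function names `percent_identity` and `shared_kmer_count` are hypothetical, the alignment is a plain position-by-position comparison without gaps, and "matching strings" is assumed to mean distinct substrings of a fixed length k that occur in both sequences.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical helper, not part of vdiff: percentage of positions holding the
// same base when the two sequences are laid side by side without gaps.
// Positions past the end of the shorter sequence count as mismatches.
double percent_identity(const std::string& a, const std::string& b) {
    const std::size_t longest = std::max(a.size(), b.size());
    if (longest == 0) return 100.0;  // two empty sequences are identical
    const std::size_t shared = std::min(a.size(), b.size());
    std::size_t matches = 0;
    for (std::size_t i = 0; i < shared; ++i) {
        if (a[i] == b[i]) ++matches;
    }
    return 100.0 * static_cast<double>(matches) / static_cast<double>(longest);
}

// Hypothetical helper, not part of vdiff: number of distinct substrings of
// length k that appear in both sequences.
std::size_t shared_kmer_count(const std::string& a, const std::string& b,
                              std::size_t k) {
    if (k == 0 || a.size() < k || b.size() < k) return 0;
    // Collect every length-k substring of the first sequence.
    std::unordered_set<std::string> seen;
    for (std::size_t i = 0; i + k <= a.size(); ++i) {
        seen.insert(a.substr(i, k));
    }
    // Keep the substrings of the second sequence that were also in the first.
    std::unordered_set<std::string> shared;
    for (std::size_t i = 0; i + k <= b.size(); ++i) {
        std::string kmer = b.substr(i, k);
        if (seen.count(kmer) > 0) shared.insert(std::move(kmer));
    }
    return shared.size();
}
```

With real genomes a gapless comparison like this is very sensitive to insertions and deletions, so treat it as an illustration of the idea rather than a substitute for a proper alignment.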

My project vdiff is available on GitHub.

Give it a shot yourself if you are interested in the subject. The source code comes with some sample genomes that you can use.

vdiff

This is an example of graphing the CSV output from comparing a SARS genome (x-axis) with a COVID-19 genome (y-axis). More points along the line x=y represent more similarities.
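For readers who want to see how a plot like this can be produced, the following is a rough sketch of generating the points. It is an assumption about the general technique (a dot plot of shared substrings), not a description of vdiff's actual CSV format: the names `match_points` and `write_csv`, the fixed substring length k, and the `x,y` column layout are all hypothetical.

```cpp
#include <cstddef>
#include <ostream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Point = std::pair<std::size_t, std::size_t>;

// Hypothetical helper, not part of vdiff: every (x, y) such that the k bases
// starting at x in `first` equal the k bases starting at y in `second`.
std::vector<Point> match_points(const std::string& first,
                                const std::string& second, std::size_t k) {
    std::vector<Point> points;
    if (k == 0 || first.size() < k || second.size() < k) return points;

    // Index each length-k substring of the second genome by its start positions.
    std::unordered_map<std::string, std::vector<std::size_t>> index;
    for (std::size_t y = 0; y + k <= second.size(); ++y) {
        index[second.substr(y, k)].push_back(y);
    }

    // Look up each length-k substring of the first genome in that index.
    for (std::size_t x = 0; x + k <= first.size(); ++x) {
        auto hit = index.find(first.substr(x, k));
        if (hit == index.end()) continue;
        for (std::size_t y : hit->second) points.emplace_back(x, y);
    }
    return points;
}

// Hypothetical helper: write the points as CSV with an assumed "x,y" header.
void write_csv(std::ostream& out, const std::vector<Point>& points) {
    out << "x,y\n";
    for (const Point& p : points) {
        out << p.first << ',' << p.second << '\n';
    }
}
```

Two identical sequences put every point on the line x=y, while an insertion or deletion in one genome shifts the points after it onto a parallel diagonal. Short values of k produce many chance matches, so a larger k gives a cleaner plot at the cost of missing short shared regions.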

More information and instructions are available in this video: