Bioinformatics

DNA Sequencing Techniques

Learn about the different techniques used to sequence DNA, from Maxam-Gilbert sequencing and Sanger ddNTP sequencing to shotgun sequencing.

Get the gist of what DNA sequencing is, what types of molecules can be sequenced, what breadth of information comes out of DNA sequencers, and the types of sequencing approaches available. Learn about the very first DNA sequencing technique, pioneered by Allan Maxam and Walter Gilbert in 1976. Learn how Sanger sequencing works, which relies heavily on the chemical properties of dideoxynucleotides (ddNTPs), and learn the pros and cons of the technique. Learn how scientists deal with the disadvantage of short reads by using a method known as primer walking. Learn how to get around the limitations of pyrosequencing by incorporating all four nucleotides in one run with reversible chain terminators. Learn the shotgun sequencing method, used to sequence genomes that haven't been sequenced before (de novo).
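To make the chain-termination idea concrete, here is a minimal, idealized sketch in Python. It assumes that each of the four ddNTP reactions yields exactly one fragment for every position where that base occurs in the newly synthesized strand, and that fragments can be sorted perfectly by length. The function names `sanger_fragments` and `read_gel` are hypothetical, chosen for illustration only.

```python
def sanger_fragments(synthesized):
    """Return, per ddNTP reaction, the lengths at which chains terminate.

    Idealized assumption: a ddNTP terminates the chain once at every
    position where its base occurs in the synthesized strand.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence by ordering fragments from shortest to longest."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[length] for length in sorted(base_at_length))


lanes = sanger_fragments("GATTACA")
print(lanes)            # fragment lengths seen in each of the four lanes
print(read_gel(lanes))  # sequence recovered from the fragment lengths
```

Real Sanger reads are limited in length and lose resolution toward the end of the read, which is what motivates approaches such as primer walking.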

Next-Generation Sequencing

Learn how Next-Generation Sequencing techniques are used today to rapidly sequence billions of DNA base pairs at low cost.

Learn what Next-Generation Sequencing (NGS) technology is, and what the term means. Learn about emulsion PCR (ePCR), one of the PCR techniques used in next-generation sequencing. Thus far we have learned how to sequence DNA using sequencing-by-synthesis methods. Let's now learn how to sequence a DNA strand by ligation methods, which bind 8 to 9 bases at a time instead of just 1. Learn how the cheap, open-source polony sequencing method works, created by George Church's group at Harvard Medical School. Learn about one of the very first second-generation sequencing technologies: pyrosequencing uses sequencing by synthesis and detects released pyrophosphate to determine whether a dNTP has been incorporated. Learn how the Ion Torrent machine works, and about the semiconductor sequencing technique. Learn how the ISFET sensor is used as essentially a pH meter that detects when a dNTP is added to a growing strand of DNA. Learn about bridge PCR, another way DNA is amplified in Next-Generation Sequencing devices. Learn how Illumina uses bridge PCR and sequencing by synthesis to sequence DNA in its Next-Generation Sequencing machines.
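Pyrosequencing and semiconductor sequencing both flow one nucleotide type at a time and record a signal (light or a pH change) whose strength reflects how many bases were incorporated. The Python sketch below simulates that idea under idealized assumptions: the signal is exactly proportional to the homopolymer length, and the flow order `TACG` is an arbitrary choice for illustration. The function names `flowgram` and `call_bases` are hypothetical.

```python
def flowgram(synthesized, flow_order="TACG", max_flows=100):
    """Simulate flowing one nucleotide at a time over a growing strand.

    Returns a list of (nucleotide, signal) pairs, where signal is the
    number of bases incorporated during that flow (idealized, noise-free).
    """
    signals = []
    position = 0
    flow = 0
    while position < len(synthesized) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        # A homopolymer run is incorporated in a single flow.
        while position < len(synthesized) and synthesized[position] == nucleotide:
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))
        flow += 1
    return signals


def call_bases(signals):
    """Convert the per-flow signals back into a base sequence."""
    return "".join(nucleotide * count for nucleotide, count in signals)


signals = flowgram("TTAGC")
print(signals)
print(call_bases(signals))
```

In practice the signal does not scale cleanly with long homopolymer runs, which is a known source of errors in flow-based methods.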

Pairwise Alignment

Learn how biological sequences are compared using homology, scoring matrices, and the global and local alignment algorithms.

In this lesson, we'll go through what sequence (pairwise) alignment is, how it is used in bioinformatics, the PAM and BLOSUM matrices used to score alignments, and the algorithms used. Learn how to qualitatively describe two sequences that share a common ancestor using the two types of homology: orthology vs. paralogy. Homologs, orthologs and paralogs arise through gene duplication and speciation. Learn how to quantitatively describe how well two sequences are aligned with the identity and similarity (positives) parameters, which are part of interpreting BLAST results. A beginner's guide on how to use NCBI protein BLAST, a powerful program used for local alignment. Let's look at how to perform pairwise alignments and search databases for a specific query. Learn about the Dayhoff model, which is used to score amino acid substitutions. Also find out about the point accepted mutation (PAM) scoring matrices PAM1 and PAM250. Learn how to score gaps when analyzing pairwise alignments. Learn what the default scoring matrix for protein BLAST is: BLOSUM62. Find out how BLOSUMs, substitution matrices used to score pairwise alignments, are constructed. Compare PAM and BLOSUM substitution matrices and find the differences between them. In this tutorial, you'll learn how to use the Needleman-Wunsch algorithm to create a matrix and find the optimal global alignment between two sequences. Learn how the algorithm behind local alignment, the Smith-Waterman algorithm, works.
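As a rough illustration of global alignment, the following Python sketch implements Needleman-Wunsch with a simple scoring scheme. The scores (match = 1, mismatch = -1, linear gap penalty = -1) are assumptions for the example; protein alignments would normally use a substitution matrix such as BLOSUM62 instead.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment."""
    n, m = len(a), len(b)

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                     # gap in b
                score[i][j - 1] + gap,                     # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


print(needleman_wunsch("ACGT", "AGT"))  # (2, 'ACGT', 'A-GT')
```

Smith-Waterman local alignment follows the same recurrence but clamps every cell at zero and starts the traceback from the highest-scoring cell rather than the corner.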

Sequence File Formats

Learn about the different file formats used for DNA and protein sequence data. We'll go over FASTA, FASTQ, SAM, BAM, CRAM, BED format and more!

Learn about the most basic file formats, including csv, tsv, and markdown. Learn what the FASTA and multi-FASTA formats are, their extensions (fas, fna, faa, ffn, frn), how to convert to and from FASTA, and how to obtain FASTA files from the NCBI database. Learn about FASTQ format, which is similar to FASTA but includes quality scores. Learn about SAMtools, and the three file formats it generates: SAM, BAM and CRAM. Learn about the BED format, which is used to define data lines displayed on a genome browser such as the UCSC Genome Browser, and which is supported by tools such as Galaxy and bedtools. Learn about the Wig and BigWig formats, used to store dense continuous data such as GC percent and probability scores. Learn about the general feature format (GFF) and the gene transfer format (GTF). Learn about the different conversion tools used to convert among file types.
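To show how FASTQ extends FASTA with quality scores, here is a small Python sketch that parses four-line FASTQ records and decodes the quality string. It assumes the common Phred+33 encoding and that each record occupies exactly four lines (no wrapped sequences). The function name `parse_fastq` and the example read are made up for illustration.

```python
def parse_fastq(text):
    """Parse four-line FASTQ records into (name, sequence, phred_scores) tuples.

    Assumes Phred+33 quality encoding and unwrapped, four-line records.
    """
    lines = [line.strip() for line in text.strip().splitlines()]
    records = []
    for start in range(0, len(lines), 4):
        header, sequence, separator, quality = lines[start:start + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        # Phred+33: quality score = ASCII code of the character minus 33.
        scores = [ord(char) - 33 for char in quality]
        records.append((header[1:], sequence, scores))
    return records


example = "@read1\nGATTACA\n+\nII!!5+I\n"
for name, sequence, scores in parse_fastq(example):
    print(name, sequence, scores)
```

A Phred score Q corresponds to an estimated error probability of 10^(-Q/10), so a score of 40 means roughly a 1 in 10,000 chance that the base call is wrong.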