Genomics and Bioinformatics

  • Session A: July 13 – July 23, 2020
  • 9:30 a.m. - 12 p.m.
  • Sciences
  • Vinayak Mathur

Module Description:

Module Abstract 

In this class we will examine the intersection of biology, computer science, and statistics, with special emphasis on the biological aspects. Advances in next-generation sequencing technologies have produced large amounts of sequencing data and a growing need for researchers trained to analyze it. A variety of bioinformatics tools are available for students to learn, enabling them to participate in authentic research, and this course introduces those tools through hands-on training. The course focuses on using analytical methods to understand the features, functions, and evolution of genomes. The overall aims are for students to 1) learn the theory underlying bioinformatics tools for genomic analysis and 2) develop an understanding of how the analysis of sequence data informs the study of biology. Students will have the opportunity to work with biological datasets, perform authentic research, and contribute to an ongoing research project.

Students are required to bring a laptop to class.

Rough syllabus 

Topics, in order, starting on Day 1:

  • Introduction to BLAST and Sequencing Technologies
  • Databases and Genome Annotation
  • Comparative Genomics
  • Community Science Project and Data Visualization
  • Introduction to Galaxy and walk-through
  • FASTQ Quality Analysis
  • Gene Ontology Classification
  • Metagenome Analysis

The course will be divided into two projects:

Days 1-5: Students work through a pipeline to identify horizontal gene transfer (HGT) in bacteria and bacteriophages. After learning the necessary bioinformatics tools, student pairs will be assigned phage proteins, which they will search against the database to look for instances of HGT. The data will be deposited in the Community Science Project database, developed by the Genome Solver team.

Days 6-9: Students will be introduced to the Galaxy pipeline and will work with large biological datasets to perform quality analysis, Gene Ontology classification, and metagenome analysis.
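
To give a flavor of what FASTQ quality analysis involves, here is a minimal Python sketch. It is not the course's Galaxy workflow; it is only an illustration that assumes simple four-line FASTQ records with Phred+33 quality encoding. The function names, the tiny example data, and the mean-quality cutoff of 20 are all hypothetical choices made for this example.

```python
from statistics import mean


def parse_fastq(text):
    """Yield (read_id, sequence, quality_string) tuples.

    Assumes each record is exactly four lines (header, sequence, '+', quality);
    multi-line FASTQ records are not handled in this sketch.
    """
    lines = [ln.strip() for ln in text.strip().splitlines()]
    for i in range(0, len(lines), 4):
        header, seq, _plus, qual = lines[i:i + 4]
        yield header[1:], seq, qual  # drop the leading '@' from the header


def phred_scores(qual, offset=33):
    """Convert a quality string to Phred scores (Phred+33 encoding assumed)."""
    return [ord(ch) - offset for ch in qual]


def summarize(text, min_mean_quality=20):
    """Return (read_id, read_length, mean_quality, passes_cutoff) per read.

    The cutoff of 20 is an illustrative assumption, not a course requirement.
    """
    report = []
    for read_id, seq, qual in parse_fastq(text):
        avg = mean(phred_scores(qual))
        report.append((read_id, len(seq), avg, avg >= min_mean_quality))
    return report


# Hypothetical two-read example: 'I' encodes Phred 40, '!' encodes Phred 0.
EXAMPLE_FASTQ = """@read1
ACGT
+
IIII
@read2
ACGT
+
!!!I
"""

for row in summarize(EXAMPLE_FASTQ):
    print(row)
```

In practice, dedicated quality-control tools report many more statistics (per-position quality, adapter content, and so on); the sketch only shows the basic idea of turning quality characters into scores and filtering on them.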

Learning outcomes 

By the end of the course students will:

  1. Be able to diagram and explain the various types of genome sequencing technologies, including their strengths and weaknesses.
  2. Gain facility with important general databases, focusing on those housed at the National Center for Biotechnology Information (NCBI). Students will also gain facility with the prokaryotic database at the Joint Genome Institute (JGI).
  3. Be able to use web tools such as BLAST, MUSCLE, and MEGA6 to examine DNA and protein sequences and to explain in general terms how they work.
  4. Be able to annotate genes in terms of both structure and function.
  5. Be able to compare gene/protein sequences and draw inferences about evolutionary history.
  6. Be able to interpret metagenomic data.
  7. Gain facility in reading and interpreting primary literature.
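
As a small illustration of the sequence comparison mentioned in the outcomes above, the following Python sketch computes percent identity between two sequences that have already been aligned. It is an assumption-laden example, not the method used by BLAST, MUSCLE, or MEGA6: the function name is hypothetical, and the convention of skipping gapped columns is only one of several definitions of identity in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of matching positions between two pre-aligned sequences.

    Columns where either sequence has a gap are excluded from the count.
    This is one common convention, chosen here for illustration only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0  # nothing to compare
    return 100.0 * matches / compared


# Hypothetical aligned pair: 5 ungapped columns, 4 of which match.
print(percent_identity("ACGT-A", "ACCTTA"))
```

Higher identity between two genes or proteins is often taken as a sign of closer evolutionary relationship, although drawing real inferences requires a proper alignment and a phylogenetic method rather than a single number.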

