Course Outline

I. Introduction and preliminaries

1. Overview

  • Enhancing the R experience: R and available Graphical User Interfaces (GUIs)
  • Overview of RStudio
  • Related software and documentation resources
  • The relationship between R and statistics
  • Interactive use of R
  • Conducting an introductory session
  • Accessing help for functions and features
  • R commands, case sensitivity, and syntax rules
  • Recalling and correcting previous commands
  • Executing commands from files and directing output
  • Managing data persistence and removing objects
  • Best practices in programming: Creating self-contained scripts, ensuring readability through structured scripts, documentation, and markdown
  • Installing packages via CRAN and Bioconductor

2. Reading data

  • TXT files (using read.delim)
  • CSV files

3. Basic manipulations; numbers, vectors, and arrays

  • Understanding vectors and assignment
  • Vector arithmetic
  • Generating regular sequences
  • Working with logical vectors
  • Handling missing values
  • Character vectors
  • Index vectors: Selecting and modifying data subsets
  • Arrays
  • Array indexing and accessing subsections
  • Index matrices
  • Using the array() function and performing simple operations (e.g., multiplication, transposition)
  • Other object types
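
To give a flavour of the topics in this module, here is a minimal sketch in base R. The values and variable names are hypothetical and are not taken from the course materials.

```r
# Vectors and assignment
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

# Vector arithmetic is applied element by element
y <- 2 * x + 1

# Generating a regular sequence from 0 to 1 in steps of 0.25
s <- seq(from = 0, to = 1, by = 0.25)

# Logical vectors can be used as index vectors to select a subset
big <- x > 6
x_big <- x[big]

# Missing values: NA propagates unless explicitly removed
z <- c(1, NA, 3)
z_missing <- is.na(z)
z_mean <- mean(z, na.rm = TRUE)

# Arrays: a 2 x 3 array filled column by column
a <- array(1:6, dim = c(2, 3))
second_row <- a[2, ]      # array indexing: whole second row
a_t <- t(a)               # transposition gives a 3 x 2 array
prod_mat <- a %*% a_t     # matrix multiplication gives a 2 x 2 matrix
```

Note that `*` would multiply arrays element-wise, whereas `%*%` performs the matrix product.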

4. Lists and data frames

  • Understanding lists
  • Constructing and modifying lists
    • Concatenating lists
  • Data frames
    • Creating data frames
    • Working with data frames
    • Attaching arbitrary lists
    • Managing the search path

5. Data manipulation

  • Selecting and subsetting observations and variables
  • Filtering and grouping data
  • Recoding and data transformations
  • Aggregation and combining data sets
  • Creating partitioned matrices using cbind() and rbind()
  • Using the concatenation function with arrays
  • Character manipulation using the stringr package
  • Introduction to grep and regexpr

6. Advanced data reading techniques

  • XLS and XLSX files
  • Using readr and readxl packages
  • Importing data from SPSS, SAS, Stata, and other formats
  • Exporting data to TXT, CSV, and other formats

7. Grouping, loops, and conditional execution

  • Grouped expressions
  • Control statements
  • Conditional execution: if statements
  • Repetitive execution: for loops, repeat, and while loops
  • Introduction to apply, lapply, sapply, and tapply functions
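
The constructs listed in this module could be sketched as follows. This is an illustrative example only: the `scores` and `group` vectors are made-up data, not part of the course.

```r
# Hypothetical example data
scores <- c(72, 85, 90, 64, 88, 79)
group  <- c("a", "b", "a", "b", "a", "b")

# for loop with conditional execution: count scores of at least 75
n_pass <- 0
for (s in scores) {
  if (s >= 75) {
    n_pass <- n_pass + 1
  }
}

# while loop: position of the first score of at least 90
i <- 1
while (i <= length(scores) && scores[i] < 90) {
  i <- i + 1
}

# lapply returns a list; sapply simplifies to a vector where possible
squares_list <- lapply(1:3, function(k) k^2)
squares_vec  <- sapply(1:3, function(k) k^2)

# tapply applies a function within each level of a grouping factor
group_means <- tapply(scores, group, mean)

# apply works over the margins of a matrix (1 = rows, 2 = columns)
m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)
```

In many cases the apply family replaces an explicit loop with a single call, which tends to be shorter and easier to read.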

8. Functions

  • Creating custom functions
  • Optional arguments and default values
  • Handling variable numbers of arguments
  • Understanding scope and its implications
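
A minimal sketch of these ideas, using two hypothetical helper functions (`standardise` and `make_counter` are illustrative names, not functions from the course or from any package):

```r
# A custom function with a default argument value (centre = TRUE).
# The ... argument forwards any extra arguments, e.g. na.rm, to mean() and sd().
standardise <- function(x, centre = TRUE, ...) {
  m <- if (centre) mean(x, ...) else 0
  (x - m) / sd(x, ...)
}

z1 <- standardise(c(1, 2, 3))                    # default: centred
z2 <- standardise(c(1, 2, 3, NA), na.rm = TRUE)  # na.rm passed through ...

# Lexical scope: the inner function remembers the environment in which
# it was created, so `count` persists between calls.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # <<- assigns in the enclosing environment
    count
  }
}

counter <- make_counter()
counter()
second_call <- counter()
```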

9. Basic graphics in R

  • Creating a graph
  • Density plots
  • Dot plots
  • Bar plots
  • Line charts
  • Pie charts
  • Boxplots
  • Scatter plots
  • Combining multiple plots

II. Statistical analysis in R

1. Probability distributions

  • Using R as a collection of statistical tables
  • Examining data distribution

2. Hypothesis testing

  • Tests concerning population means
  • Likelihood Ratio Test
  • One-sample and two-sample tests
  • Chi-Square Goodness-of-Fit Test
  • Kolmogorov-Smirnov One-Sample Statistic
  • Wilcoxon Signed-Rank Test
  • Two-Sample Test
  • Wilcoxon Rank Sum Test
  • Mann-Whitney Test
  • Kolmogorov-Smirnov Test

3. Multiple hypothesis testing

  • Type I Error and False Discovery Rate (FDR)
  • ROC curves and Area Under the Curve (AUC)
  • Multiple testing procedures (Bonferroni, Benjamini-Hochberg, etc.)
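
As an illustration of the multiple testing procedures named above, base R provides `p.adjust()` in the stats package. The p-values below are invented for the example.

```r
# Hypothetical raw p-values from five tests
p <- c(0.001, 0.008, 0.02, 0.03, 0.20)

# Bonferroni controls the family-wise Type I error rate
p_bonf <- p.adjust(p, method = "bonferroni")

# Benjamini-Hochberg controls the false discovery rate (FDR)
p_bh <- p.adjust(p, method = "BH")

# Number of hypotheses rejected at the 0.05 level under each procedure
n_bonf <- sum(p_bonf < 0.05)
n_bh   <- sum(p_bh < 0.05)
```

With these particular values the Bonferroni correction rejects two hypotheses and Benjamini-Hochberg rejects four, which reflects the less conservative nature of FDR control.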

4. Linear regression models

  • Generic functions for extracting model information
  • Updating fitted models
  • Generalized linear models
    • Families
    • The glm() function
  • Classification techniques
    • Logistic Regression
    • Linear Discriminant Analysis
  • Unsupervised learning methods
    • Principal Components Analysis
    • Clustering Methods (k-means, hierarchical clustering, k-medoids)

5. Survival analysis (survival package)

  • Working with survival objects in R
  • Kaplan-Meier estimates, log-rank test, and parametric regression
  • Calculating confidence bands
  • Analysis of censored data, including interval-censored data
  • Cox Proportional Hazards (PH) models with constant covariates
  • Cox PH models with time-dependent covariates
  • Simulation: Model comparison techniques

6. Analysis of Variance (ANOVA)

  • One-Way ANOVA
  • Two-Way ANOVA (two-way classification)
  • MANOVA

III. Worked problems in bioinformatics

  • Short introduction to the limma package
  • Microarray data analysis workflow
  • Downloading data from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397
  • Data processing steps: Quality control (QC), normalization, and differential expression analysis
  • Creating volcano plots
  • Clustering examples and heatmaps

Duration: 28 hours
