Course Outline

I. Introduction and Preliminaries

1. Course Overview

  • Enhancing R's accessibility: R and available Graphical User Interfaces (GUIs)
  • RStudio
  • Complementary software and documentation resources
  • The relationship between R and statistics
  • Interactive usage of R
  • Initial orientation session
  • Accessing help for functions and features
  • R syntax, including case sensitivity and other conventions
  • Retrieving and correcting previous commands
  • Executing commands from files and redirecting output
  • Managing data persistence and removing objects
  • Best practices in programming: Self-contained scripts, readability (e.g., structured scripts, documentation, markdown)
  • Installing packages; understanding CRAN and Bioconductor

2. Data Import

  • Text files (using read.delim)
  • CSV files

3. Basic Manipulations: Numbers, Vectors, and Arrays

  • Understanding vectors and assignment operations
  • Vector arithmetic
  • Creating regular sequences
  • Working with logical vectors
  • Handling missing values
  • Character vectors
  • Index vectors: Selecting and modifying data subsets
  • Arrays
  • Array indexing and accessing subsections
  • Index matrices
  • The array() function and basic array operations (e.g., multiplication, transposition)
  • Other object types

4. Lists and Data Frames

  • Lists
  • Constructing and modifying lists
    • Concatenating lists
  • Data frames
    • Creating data frames
    • Manipulating data frames
    • Attaching arbitrary lists
    • Managing the search path

5. Data Manipulation Techniques

  • Selecting and subsetting observations and variables
  • Filtering and grouping data
  • Recoding data and applying transformations
  • Aggregation and merging data sets
  • Creating partitioned matrices using cbind() and rbind()
  • Concatenation functions with arrays
  • String manipulation using the stringr package
  • Introduction to grep and regexpr

6. Advanced Data Import

  • Excel files (XLS, XLSX)
  • The readr and readxl packages
  • Importing data from SPSS, SAS, Stata, and other formats
  • Exporting data to TXT, CSV, and other formats

7. Grouping, Loops, and Conditional Execution

  • Grouped expressions
  • Control statements
  • Conditional execution: if statements
  • Repetitive execution: for loops, repeat, and while loops
  • Introduction to apply, lapply, sapply, and tapply functions

8. Creating Functions

  • Defining functions
  • Optional arguments and default values
  • Handling a variable number of arguments
  • Understanding scope and its implications

9. Basic Graphics in R

  • Creating graphs
  • Density plots
  • Dot plots
  • Bar plots
  • Line charts
  • Pie charts
  • Boxplots
  • Scatter plots
  • Combining multiple plots

II. Statistical Analysis in R

1. Probability Distributions

  • R as a repository of statistical tables
  • Examining data set distributions

2. Hypothesis Testing

  • Tests concerning population means
  • Likelihood ratio tests
  • One-sample and two-sample tests
  • Chi-square goodness-of-fit test
  • Kolmogorov-Smirnov one-sample statistic
  • Wilcoxon signed-rank test
  • Two-sample tests
    • Wilcoxon rank sum test
    • Mann-Whitney test
    • Kolmogorov-Smirnov test

3. Multiple Hypothesis Testing

  • Type I error and False Discovery Rate (FDR)
  • Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC)
  • Multiple testing procedures (e.g., Benjamini-Hochberg, Bonferroni)

4. Linear Regression Models

  • Generic functions for extracting model information
  • Updating fitted models
  • Generalized linear models (GLMs)
    • Families
    • The glm() function
  • Classification techniques
    • Logistic regression
    • Linear discriminant analysis
  • Unsupervised learning
    • Principal Component Analysis (PCA)
    • Clustering methods (k-means, hierarchical clustering, k-medoids)

5. Survival Analysis (using the survival package)

  • Survival objects in R
  • Kaplan-Meier estimates, log-rank tests, and parametric regression
  • Confidence bands
  • Analysis of censored data, including interval-censored data
  • Cox Proportional Hazards (PH) models with constant covariates
  • Cox PH models with time-dependent covariates
  • Simulation: Model comparison techniques

6. Analysis of Variance (ANOVA)

  • One-way ANOVA
  • Two-way classification in ANOVA
  • MANOVA

III. Applied Problems in Bioinformatics

  • Introduction to the limma package
  • Workflow for microarray data analysis
  • Data retrieval from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397
  • Data processing steps (Quality Control, normalization, differential expression analysis)
  • Generating volcano plots
  • Clustering examples and heatmaps

Duration: 28 Hours
