Course Outline
I. Introduction and Preliminaries
1. Course Overview
- Enhancing R's accessibility: R and available Graphical User Interfaces (GUIs)
- RStudio
- Complementary software and documentation resources
- The relationship between R and statistics
- Interactive usage of R
- Initial orientation session
- Accessing help for functions and features
- R syntax, including case sensitivity and other conventions
- Retrieving and correcting previous commands
- Executing commands from files and redirecting output
- Managing data persistence and removing objects
- Best practices in programming: self-contained scripts and readability (e.g., structured scripts, documentation, Markdown)
- Installing packages; understanding CRAN and Bioconductor
2. Data Import
- Text files (using read.delim())
- CSV files
3. Basic Manipulations: Numbers, Vectors, and Arrays
- Understanding vectors and assignment operations
- Vector arithmetic
- Creating regular sequences
- Working with logical vectors
- Handling missing values
- Character vectors
- Index vectors: Selecting and modifying data subsets
- Arrays
- Array indexing and accessing subsections
- Index matrices
- The array() function and basic array operations (e.g., multiplication, transposition)
- Other object types
4. Lists and Data Frames
- Lists
- Constructing and modifying lists
- Concatenating lists
- Data frames
- Creating data frames
- Manipulating data frames
- Attaching arbitrary lists
- Managing the search path
5. Data Manipulation Techniques
- Selecting, subsetting observations and variables
- Filtering and grouping data
- Recoding data and applying transformations
- Aggregation and merging data sets
- Creating partitioned matrices using cbind() and rbind()
- The concatenation function c() with arrays
- String manipulation using the stringr package
- Introduction to grep() and regexpr()
6. Advanced Data Import
- Excel files (XLS, XLSX)
- The readr and readxl packages
- Importing data from SPSS, SAS, Stata, and other formats
- Exporting data to TXT, CSV, and other formats
7. Grouping, Loops, and Conditional Execution
- Grouped expressions
- Control statements
- Conditional execution: if statements
- Repetitive execution: for loops, repeat, and while loops
- Introduction to the apply(), lapply(), sapply(), and tapply() functions
8. Creating Functions
- Defining functions
- Optional arguments and default values
- Handling a variable number of arguments
- Understanding scope and its implications
9. Basic Graphics in R
- Creating graphs
- Density plots
- Dot plots
- Bar plots
- Line charts
- Pie charts
- Boxplots
- Scatter plots
- Combining multiple plots
II. Statistical Analysis in R
1. Probability Distributions
- R as a repository of statistical tables
- Examining data set distributions
2. Hypothesis Testing
- Tests concerning population means
- Likelihood ratio tests
- One-sample tests
- Chi-square goodness-of-fit test
- Kolmogorov-Smirnov one-sample statistic
- Wilcoxon signed-rank test
- Two-sample tests
- Wilcoxon rank-sum (Mann-Whitney) test
- Kolmogorov-Smirnov test
3. Multiple Hypothesis Testing
- Type I error and False Discovery Rate (FDR)
- Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC)
- Multiple testing procedures (e.g., Benjamini-Hochberg, Bonferroni)
4. Linear Regression Models
- Generic functions for extracting model information
- Updating fitted models
- Generalized linear models (GLMs)
- Families
- The glm() function
- Classification techniques
- Logistic regression
- Linear discriminant analysis
- Unsupervised learning
- Principal Component Analysis (PCA)
- Clustering methods (k-means, hierarchical clustering, k-medoids)
5. Survival Analysis (using the survival package)
- Survival objects in R
- Kaplan-Meier estimates, log-rank tests, and parametric regression
- Confidence bands
- Analysis of censored (interval-censored) data
- Cox Proportional Hazards (PH) models with constant covariates
- Cox PH models with time-dependent covariates
- Simulation: Model comparison techniques
6. Analysis of Variance (ANOVA)
- One-way ANOVA
- Two-way classification in ANOVA
- MANOVA
III. Applied Problems in Bioinformatics
- Introduction to the limma package
- Workflow for microarray data analysis
- Data retrieval from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397
- Data processing steps (Quality Control, normalization, differential expression analysis)
- Generating volcano plots
- Clustering examples and heatmaps
Duration: 28 hours