Understanding how to code in R using RStudio

R is a free programming language and software environment for statistical computing and graphics. RStudio is an open-source integrated development environment (IDE) for R. It is usually preferred when working with R because, besides being free, it provides a convenient way of organising your R scripts, console, plots, and files in one place.

Unlike statically typed languages such as C or Java, R does not require you to declare the type of a variable. It stores data in R objects, and each object takes its type from the data assigned to it. The R objects most commonly used to store data are vectors, lists, matrices, arrays, factors, and data frames.
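
As a minimal sketch of this automatic typing, the built-in class() function reports the type that R has given to a value. The variable names below are illustrative and are not part of the original tutorial.

```r
# Illustrative names; R infers each type from the assigned value.
count <- 42L       # integer (the L suffix requests an integer)
ratio <- 3.14      # numeric (double precision)
label <- "hello"   # character
flag  <- TRUE      # logical

types <- c(class(count), class(ratio), class(label), class(flag))
print(types)
```

No type declaration appears anywhere in the code: the type follows from the value on the right-hand side of each assignment.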

Working with vectors in RStudio

A vector is an R object that stores a single value or a group of values of the same data type. To store a group of values, combine them with the c() function. The example below creates a vector that stores a group of values:

> myFirstVector <- c("Hello", "1", "2")
> print(myFirstVector)
[1] "Hello" "1"     "2"
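
Because every element of a vector must share one type, R quietly converts mixed input to the most general type present. The following is a small sketch of that behaviour; the variable names are made up for illustration.

```r
mixed <- c(1, "two", TRUE)   # a number, a string, and a logical combined
print(class(mixed))          # every element has become a character string
print(mixed)

numbers <- c(1, 2.5, TRUE)   # the logical TRUE is converted to the number 1
print(class(numbers))
print(numbers)
```

If the values really need to keep different types, a list is probably the better container.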

You can extract an element from a vector by specifying its index in square brackets. Indexing in R starts at 1. This is demonstrated below:

> # The following line of code extracts the first element in the vector
> print(myFirstVector[1])
[1] "Hello"
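
Square brackets accept more than a single position. The sketch below, which uses a hypothetical numeric vector, shows three other common forms of indexing: a range of positions, a negative index, and a logical condition.

```r
scores <- c(10, 20, 30, 40, 50)    # hypothetical example data

first_two <- scores[1:2]           # elements at positions 1 and 2
without_3 <- scores[-3]            # every element except the third
above_25  <- scores[scores > 25]   # elements that satisfy a condition

print(first_two)
print(without_3)
print(above_25)
```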

Working with lists in RStudio

Unlike vectors, lists can contain multiple data types. To create a list in RStudio, you must use the list() function; if you use c() instead, R creates a vector and converts the values to a single type. The example below shows how lists can be created in R:

> myFirstList <- list("hey", 1, 2)
> print(myFirstList)
[[1]]
[1] "hey"

[[2]]
[1] 1

[[3]]
[1] 2

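An element of a list can be extracted by specifying the index where it is located. Note that single square brackets return a list containing that element, which is why the output below starts with [[1]]; double square brackets, as in myFirstList[[2]], return the element itself. This is demonstrated below: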

> myFirstList <- list("hey", 1, 2)
> # The line of code below prints the 2nd element in the list
> print(myFirstList[2])
[[1]]
[1] 1
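
The difference between single and double square brackets is easy to miss, so here is a short sketch of it. The list and its element names are hypothetical; naming the elements is optional but allows access with the $ operator.

```r
person <- list(name = "Ama", age = 21, scores = c(80, 95))  # hypothetical data

sub_list <- person[2]     # single brackets: a list of length 1
age      <- person[[2]]   # double brackets: the stored value itself
by_name  <- person$name   # named elements can also be reached with $

print(class(sub_list))
print(class(age))
print(by_name)
```

Use double brackets (or $) when you want to compute with the value, and single brackets when you want a smaller list.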

Working with matrices in RStudio

The matrix R object stores data in two dimensions: rows and columns. To create a matrix in RStudio, use the matrix() function, which accepts these arguments: matrix(data, nrow, ncol, byrow, dimnames). Here is a quick description of each of these arguments:

  1. data: The vector of values used to fill the matrix.
  2. nrow: The number of rows to create.
  3. ncol: The number of columns to create.
  4. byrow: A Boolean (logical) value that controls how the values are arranged. If TRUE, the matrix is filled row by row; if FALSE (the default), it is filled column by column.
  5. dimnames: A list of two vectors giving the names of the rows and the names of the columns.

The example below shows how matrices can be created in RStudio:

> RN <- c("Row1", "Row2", "Row3")
> CN <- c("Col1", "Col2", "Col3")
> MyMatrix <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, dimnames = list(RN, CN))
> print(MyMatrix)
     Col1 Col2 Col3
Row1    1    4    7
Row2    2    5    8
Row3    3    6    9
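
The matrix above is filled column by column because byrow was left at its default. As a minimal sketch of the alternative, the following compares the two fill orders on a smaller matrix; the variable names are illustrative.

```r
by_column <- matrix(1:6, nrow = 2)                 # default: fill column by column
by_row    <- matrix(1:6, nrow = 2, byrow = TRUE)   # fill row by row instead

print(by_column)
print(by_row)
print(dim(by_row))   # number of rows, then number of columns
```

Only nrow is given here; R works out the number of columns from the length of the data.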

You can extract an element from a matrix by specifying both its row index and its column index. This is demonstrated below:

> print(MyMatrix)
     Col1 Col2 Col3
Row1    1    4    7
Row2    2    5    8
Row3    3    6    9
> # The following command prints the element in the 2nd row and 2nd column
> print(MyMatrix[2, 2])
[1] 5
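
The same bracket notation can return a whole row or column when one of the two positions is left empty. The sketch below rebuilds a similar 3 x 3 matrix under the illustrative name m so that it runs on its own.

```r
m <- matrix(1:9, nrow = 3,
            dimnames = list(c("Row1", "Row2", "Row3"),
                            c("Col1", "Col2", "Col3")))

second_row   <- m[2, ]              # whole row: leave the column index empty
third_column <- m[, 3]              # whole column: leave the row index empty
by_name      <- m["Row2", "Col2"]   # dimension names work as indices too

print(second_row)
print(third_column)
print(by_name)
```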

Working with data frames in RStudio

The data frame R object is similar in structure to a matrix or a two-dimensional array, since it has rows and columns. It differs in that each column represents a variable and each row represents one observation (an instance) of those variables, so different columns may hold different data types. To create a data frame in RStudio, first create a variable for each column and assign the corresponding group of values to it, making sure that all the columns have the same length. Then pass these variables as arguments to the data.frame() function. This is demonstrated below.

> Names <- c("David", "Stephan", "Joachim")
> Age <- c(19, 22, 20)
> CourseofStudy <- c("Computer Science", "Computer Engineering", "Electrical Engineering")
> StudentData <- data.frame(Name = Names, Age = Age, Course = CourseofStudy)
> print(StudentData)
     Name Age                 Course
1   David  19       Computer Science
2 Stephan  22   Computer Engineering
3 Joachim  20 Electrical Engineering

To extract a column from the data frame, specify the index of the column that you want to extract. This is demonstrated below:

> print(StudentData[1])
     Name
1   David
2 Stephan
3 Joachim

To extract any row from the data frame, you need to specify the index of the row you want, followed by a comma. This is demonstrated below:

> print(StudentData[2, ])
     Name Age               Course
2 Stephan  22 Computer Engineering
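
The row position does not have to be a fixed number. As a sketch of a common pattern, a logical condition can be placed before the comma to keep only the rows that satisfy it. The data frame below is a reduced, hypothetical copy of the student data so that the example is self-contained.

```r
students <- data.frame(
  Name = c("David", "Stephan", "Joachim"),
  Age  = c(19, 22, 20)
)

# Keep only the rows where Age is at least 20; the empty column
# position after the comma keeps every column.
adults <- students[students$Age >= 20, ]
print(adults)
print(nrow(adults))
```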

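To interact with the vectors containing the column values directly, use the $ operator. The output below comes from a version of R earlier than 4.0.0, in which data.frame() converts text columns to factors by default; from R 4.0.0 onwards the same command prints a plain character vector unless the data frame was created with stringsAsFactors = TRUE. This is demonstrated below:

> print(StudentData$Course)
[1] Computer Science       Computer Engineering   Electrical Engineering
Levels: Computer Engineering Computer Science Electrical Engineering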
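
Since a column reached with $ behaves like an ordinary vector, it can be used in calculations and even assigned to. The following sketch assumes a small hypothetical data frame; AgeNextYear is an invented column name.

```r
students <- data.frame(
  Name = c("David", "Stephan", "Joachim"),
  Age  = c(19, 22, 20)
)

average_age <- mean(students$Age)          # a column behaves like a vector
students$AgeNextYear <- students$Age + 1   # assigning to a new name adds a column

print(average_age)
print(students)
```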

To inspect the structure of the data frame, use the str() function. It prints its report directly, so there is no need to wrap it in print(). This is demonstrated below:

> str(StudentData)
'data.frame': 3 obs. of  3 variables:
 $ Name  : Factor w/ 3 levels "David","Joachim",..: 1 3 2
 $ Age   : num  19 22 20
 $ Course: Factor w/ 3 levels "Computer Engineering",..: 2 1 3

To obtain a statistical summary of the information in your data frame, use the summary() function. This is demonstrated below:

> summary(StudentData)
      Name        Age                           Course
 David  :1   Min.   :19.00   Computer Engineering  :1
 Joachim:1   1st Qu.:19.50   Computer Science      :1
 Stephan:1   Median :20.00   Electrical Engineering:1
             Mean   :20.33
             3rd Qu.:21.00
             Max.   :22.00

Decision structures in R

Using the if statement

The if statement in R executes a block of statements only if a particular condition is true. This is demonstrated below:

 
> A <- 2
> B <- 3
> if (A < B) {
+   print("true")
+ }
[1] "true"

Using the if-else statement

The if-else statement in R executes one block of statements if a particular condition is true and a separate block if that condition is false. This is demonstrated below:

 
> A <- 2
> B <- 3
> if (A > B) {
+   print("true")
+ } else {
+   print("false")
+ }
[1] "false"
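
When there are more than two outcomes, branches can be chained with else if. The sketch below uses a hypothetical score and grading thresholds, and then shows the built-in ifelse() function, which applies a test to every element of a vector at once.

```r
score <- 72   # hypothetical input value

# Conditions are tested from top to bottom; the first TRUE branch runs.
if (score >= 80) {
  grade <- "A"
} else if (score >= 60) {
  grade <- "B"
} else {
  grade <- "C"
}
print(grade)

# ifelse() applies a test to every element of a vector at once.
results <- ifelse(c(45, 72, 90) >= 60, "pass", "fail")
print(results)
```

A plain if expects a single TRUE or FALSE value, so ifelse() is usually the better fit when the condition is a whole vector.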

Using the switch statement

The switch() function lets R select one of several alternatives depending on an input value. When the input is an integer, as it is here, switch() returns the alternative at that position. This is demonstrated below:

> Num <- as.integer(readline("Enter a number between 1 and 5: "))
Enter a number between 1 and 5: 4
> Result <- switch(Num, "One", "2", "It's Three!", "Almost There!", "Done!")
> print(Result)
[1] "Almost There!"
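
switch() can also match on a character string instead of a position. The following is a sketch under that assumption; describe_day is a hypothetical helper written for this illustration, not a built-in function.

```r
# Hypothetical helper, not part of the original tutorial.
describe_day <- function(day) {
  switch(day,
         "sat" = ,                  # an empty case falls through to the next one
         "sun" = "Weekend",
         "mon" = "Start of the week",
         "Ordinary weekday")        # the unnamed last argument is the default
}

print(describe_day("sat"))
print(describe_day("mon"))
print(describe_day("wed"))
```

With an integer input there is no default: a number outside the range of alternatives makes switch() return NULL, so it is worth validating the input first.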

Loops in R

The Repeat Loop

The repeat loop executes a block of statements over and over until it is exited with the break statement. This is demonstrated below:

> Count <- 1
> repeat
+ {
+   print(Count)
+   if (Count > 5)
+   {
+     break
+   }
+   Count <- Count + 1
+ }
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6

The While Loop

The while loop executes a block of statements for as long as a particular condition remains true, and stops as soon as the condition becomes false. This is demonstrated below:

> Count <- 1
> while (Count <= 6)
+ {
+   print(Count)
+   Count <- Count + 1
+ }
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
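
Alongside break, R provides the next statement, which skips the rest of the current iteration and moves on to the following one. The sketch below uses it inside a while loop to add up only the even numbers from 1 to 10; the variable names are illustrative.

```r
total <- 0
i <- 0
while (i < 10) {
  i <- i + 1
  if (i %% 2 == 1) {
    next               # skip odd numbers and go to the next iteration
  }
  total <- total + i   # runs only for even values of i
}
print(total)
```

The counter is increased before next is reached; otherwise the loop would never move past the first odd number.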

The For Loop

The for loop executes a block of statements once for each element of a vector or list, so the number of iterations is known in advance. This is demonstrated below:

> MyStrings <- c("This", "is", "a", "string.")
> for (AString in MyStrings)
+ {
+   print(AString)
+ }
[1] "This"
[1] "is"
[1] "a"
[1] "string."
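
Sometimes the position of each element is needed as well as its value, for example to store a result for every element. One way to sketch this is to loop over the indices produced by seq_along(); the names words and word_lengths are illustrative.

```r
words <- c("This", "is", "a", "string.")

word_lengths <- integer(length(words))   # pre-allocate the result vector
for (i in seq_along(words)) {            # i takes the values 1, 2, 3, 4
  word_lengths[i] <- nchar(words[i])     # number of characters in each word
}
print(word_lengths)
```

seq_along() is used rather than 1:length(words) because it yields no iterations at all when the vector is empty.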

Conclusion

I hope that this tutorial has given you the basic skills you need to create simple R programs using RStudio.


David Sasu is a senior studying Computer Science at Ashesi University. He is passionate about understanding technology and using it to solve important problems. He is currently working on the creation of information systems for under-funded orphanages in his country, Ghana. He hopes to specialize in the fields of Artificial Intelligence and cybersecurity to enable him to create systems to help safeguard and improve the African continent.

