Friday, July 5, 2024

How to Select Random Samples in R (With Examples)

To select a random sample in R we can use the sample() function, which uses the following syntax:

sample(x, size, replace = FALSE, prob = NULL)

where:

  • x: A vector of elements from which to choose.
  • size: Sample size.
  • replace: Whether to sample with replacement or not. Default is FALSE.
  • prob: A vector of probability weights, one for each element of x. Default is NULL, which makes every element equally likely.
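The examples in this tutorial leave prob at its default, so a brief sketch of weighted sampling may help. The outcomes and weights vectors and the seed value below are hypothetical, chosen only for illustration:

```r
# Hypothetical example values, not taken from the tutorial's data
outcomes <- c("red", "green", "blue")
weights  <- c(0.7, 0.2, 0.1)  # one weight per element of outcomes

# Arbitrary seed so the draw can be repeated
set.seed(1)

# Draw 1000 values with replacement, favoring "red"
draws <- sample(x = outcomes, size = 1000, replace = TRUE, prob = weights)

# Observed proportions should land close to the weights
prop.table(table(draws))
```

With this many draws, the proportion of each outcome should be close to its weight, although the exact figures depend on the seed.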

This tutorial explains how to use this function to select a random sample in R from both a vector and a data frame.

Example 1: Random Sample from a Vector

The following code shows how to select a random sample from a vector without replacement:

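#create vector of data (example values assumed, chosen to be consistent with the output below)
data <- c(1, 3, 5, 6, 7, 8, 10, 11, 12, 14)

#select random sample of 5 elements without replacement
sample(x=data, size=5)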

[1] 10 12  5 14  7

The following code shows how to select a random sample from a vector with replacement:

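#create vector of data (example values assumed, chosen to be consistent with the output below)
data <- c(1, 3, 5, 6, 7, 8, 10, 11, 12, 14)

#select random sample of 5 elements with replacement
sample(x=data, size=5, replace=TRUE)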

[1] 12  1  1  6 14
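One practical difference between the two modes is that, without replacement, the sample size cannot exceed the length of the vector. The sketch below illustrates this with a hypothetical three-element vector named small_vec; tryCatch() is used only so that the script keeps running after the error:

```r
# Hypothetical short vector used only for this illustration
small_vec <- c(1, 3, 5)

# Without replacement, asking for more elements than exist raises an error.
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(
  sample(x = small_vec, size = 5),
  error = function(e) conditionMessage(e)
)
result

# With replacement, the same request succeeds because values can repeat
with_repl <- sample(x = small_vec, size = 5, replace = TRUE)
length(with_repl)
```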

Example 2: Random Sample from a Data Frame

The following code shows how to select a random sample from a data frame:

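#create data frame
df <- data.frame(x=c(3, 5, 6, 6, 8, 12, 14),
                 y=c(12, 6, 4, 23, 25, 8, 9),
                 z=c(2, 7, 8, 8, 15, 17, 29))

#view data frame
df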

   x  y  z
1  3 12  2
2  5  6  7
3  6  4  8
4  6 23  8
5  8 25 15
6 12  8 17
7 14  9 29

#select random sample of three rows from data frame
rand_df <- df[sample(nrow(df), size=3), ]

#display randomly selected rows
rand_df

   x  y  z
4  6 23  8
7 14  9 29
1  3 12  2

Here’s what’s happening in this bit of code:

1. To select a subset of a data frame in R, we use the following syntax: df[rows, columns]

2. In the code above, sample(nrow(df), size=3) returns 3 randomly chosen row numbers, and leaving the columns position blank keeps all columns.

3. The end result is a subset of the data frame with 3 randomly selected rows.
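The steps above can be split into two statements so that the intermediate row numbers are visible. This is a minimal sketch: df is rebuilt from the values printed earlier, and row_ids is a name introduced here for illustration:

```r
# Data frame rebuilt from the values printed in the tutorial
df <- data.frame(x = c(3, 5, 6, 6, 8, 12, 14),
                 y = c(12, 6, 4, 23, 25, 8, 9),
                 z = c(2, 7, 8, 8, 15, 17, 29))

# Step 1: draw 3 distinct row numbers between 1 and nrow(df)
row_ids <- sample(nrow(df), size = 3)
row_ids

# Step 2: use them in the rows position; the blank columns position keeps all columns
rand_df <- df[row_ids, ]
rand_df
```

The row names of rand_df match the drawn row numbers, which is why the output in the example keeps the original row labels rather than renumbering from 1.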

It’s important to note that each time we use the sample() function, R will generally select a different sample, since the function chooses values randomly.

To make the results of an analysis reproducible, call set.seed() with a fixed number before sampling so that the sample() function chooses the same random sample each time. For example:

#make this example reproducible
set.seed(23)

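#create data frame
df <- data.frame(x=c(3, 5, 6, 6, 8, 12, 14),
                 y=c(12, 6, 4, 23, 25, 8, 9),
                 z=c(2, 7, 8, 8, 15, 17, 29))

#select random sample of three rows from data frame
rand_df <- df[sample(nrow(df), size=3), ]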

#display randomly selected rows
rand_df

   x  y  z
5  8 25 15
2  5  6  7
6 12  8 17

Each time you run the above code, the same 3 rows of the data frame will be selected. The exact rows can differ between R versions, because R 3.6.0 changed the default sampling algorithm.
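One way to check this behavior is to draw twice with the same seed and compare the results. The sketch below uses a hypothetical vector named values; the seed 23 is reused from the example above, but any fixed number would work:

```r
# Hypothetical vector to sample from
values <- 1:100

# First draw with a fixed seed
set.seed(23)
first <- sample(values, size = 5)

# Resetting the same seed restarts the random number stream
set.seed(23)
second <- sample(values, size = 5)

# Both draws contain the same elements in the same order
identical(first, second)

# Without resetting the seed, the next draw continues the stream
third <- sample(values, size = 5)
third
```

The seed has to be set immediately before the call that should be repeated, because every random draw in between advances the stream.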

Additional Resources

Stratified Sampling in R (With Examples)
Systematic Sampling in R (With Examples)
Cluster Sampling in R (With Examples)
