Questions tagged [tidyverse]
ONLY use this tag if your question relates to the installation, integration with your system, or inclusion of the entire tidyverse library. DO NOT USE if your question relates to one or two components of the tidyverse, such as dplyr or ggplot2. Use *those* tags, and tag with `r` as well for a better response.
10,143
questions
34
votes
7
answers
13k
views
combine rows in data frame containing NA to make complete row
I know this is a duplicate Q but I can't seem to find the post again
Using the following data
df <- data.frame(A=c(1,1,2,2),B=c(NA,2,NA,4),C=c(3,NA,NA,5),D=c(NA,2,3,NA),E=c(5,NA,NA,4))
A B C ...
33
votes
1
answer
12k
views
Duplicating (and modifying) discrete axis in ggplot2
I want to duplicate the left-side Y-axis on a ggplot2 plot onto the right side, and then change the tick labels for a discrete (categorical) axis.
I've read the answer to this question, however as ...
31
votes
3
answers
4k
views
data.table equivalent of tidyr::complete()
tidyr::complete() adds rows to a data.frame for combinations of column values that are missing from the data. Example:
library(dplyr)
library(tidyr)
df <- data.frame(person = c(1,2,2),
...
29
votes
5
answers
9k
views
tidyverse - preferred way to turn a named vector into a data.frame/tibble
Using the tidyverse a lot, I often face the challenge of turning named vectors into a data.frame/tibble with the columns being the names of the vector.
What is the preferred/tidyversey way of doing this?...
28
votes
4
answers
37k
views
pivot_longer into multiple columns
I am trying to use pivot_longer. However, I am not sure how to use names_sep or names_pattern to solve this.
dat <- tribble(
~group, ~BP, ~HS, ~BB, ~lowerBP, ~upperBP, ~lowerHS, ~upperHS, ~...
28
votes
3
answers
12k
views
R purrr::pmap: how to refer to input arguments by name?
I am using R purrr::pmap with three inputs. It is not clear how I can refer explicitly to these inputs in the formula call. When using map2, the formula call goes as ~ .x + .y. But how to do when ...
28
votes
1
answer
10k
views
How do {{}} double curly brackets work in dplyr?
I saw Hadley's talk at RConf and he mentioned using double brackets for calling variables in tidy evals.
I searched Google but I couldn't find anything talking about when to use them.
What's the use ...
26
votes
10
answers
2k
views
Canonical tidyverse method to update some values of a vector from a look-up table
I frequently need to recode some (not all!) values in a data frame column based off of a look-up table. I'm not satisfied by the ways I know of to solve the problem. I'd like to be able to do it in a ...
26
votes
3
answers
34k
views
Having trouble viewing more than 10 rows in a tibble [duplicate]
First off - I am a beginner at programming and R, so excuse me if this is a silly question. I am having trouble viewing more than ten rows in a tibble that is generated from the following code.
The ...
26
votes
2
answers
51k
views
Error in bind_rows_(x, .id) : Argument 1 must have names
Here is a code snippet:
y <- purrr::map(1:2, ~ c(a=.x))
test1 <- dplyr::bind_rows(y)
test2 <- do.call(dplyr::bind_rows, y)
The first call to bind_rows (test1) generates the error
Error in ...
25
votes
9
answers
11k
views
How to name the list of the group_split output in dplyr
I have the following process which uses group_split of dplyr:
library(tidyverse)
set.seed(1)
iris %>% sample_n(size = 5) %>%
group_by(Species) %>%
group_split()
The result is:
[[...
24
votes
1
answer
14k
views
round_any equivalent for dplyr?
I am trying to switch to the "new" tidyverse ecosystem and avoid loading the older packages from Wickham et al. that I previously relied on in my code. I found the round_any function from plyr ...
24
votes
5
answers
10k
views
R: convert to factor with order of levels same as in case_when
When doing data analysis, I sometimes need to recode values to factors in order to carry out group analysis. I want to keep the order of the factor levels the same as the order of conversion specified in case_when. ...
23
votes
1
answer
637
views
Order of operations in summarise [closed]
What is happening in the first line of code and why does the result differ from the two next results?
library(tidyverse)
library(magrittr)
data.frame(A=c(2,2),B=c(1,1)) %>%
summarise(A = sum(A)...
22
votes
2
answers
18k
views
Replace NA on numeric columns with mutate_if and replace_na
I would like to replace NAs in numeric columns using some variation of mutate_if and replace_na if possible, but can't figure out the syntax.
df <-tibble(
first = c("a", NA, "b"),
second = ...
21
votes
4
answers
7k
views
Filling missing dates in a grouped time series - a tidyverse-way?
Given a data.frame that contains a time series and one or more grouping fields. So we have several time series - one for each grouping combination.
But some dates are missing.
So, what's the easiest (...
20
votes
6
answers
23k
views
Remove an element of a list by name
I'm working with a long named list and I'm trying to keep/remove elements that match a certain name, within a tidyverse context, similar to
dplyr::select(contains("pattern"))
However, I'm having ...
20
votes
4
answers
1k
views
How to obtain a position of last non-zero element
I've got a binary variable representing whether an event happened or not:
event <- c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0)
I need to obtain a variable that would indicate the time when the last ...
20
votes
2
answers
4k
views
Referring to column names inside dplyr's across()
Is it possible to refer to column names in a lambda function inside across()?
df <- tibble(age = c(12, 45), sex = c('f', 'f'))
allowed_values <- list(age = 18:100, sex = c("f", "m"))
df %>%
...
20
votes
6
answers
3k
views
Does a multi-value purrr::pluck exist?
Seems like a basic question and perhaps I'm just missing something obvious ... but is there any way to pluck a sublist (with purrr)?
More specifically, here's an initial list:
l <- list(a = "foo", ...
18
votes
8
answers
31k
views
Removing suffix from column names using rename_all?
I have a data frame with a number of columns of the form var1.mean, var2.mean. I would like to strip the suffix ".mean" from all columns that contain it. I tried using rename_all in conjunction with ...
18
votes
7
answers
3k
views
Is it possible to convert a dataframe object to a tribble constructor?
I have data that looks like this:
library(tidyverse)
df <- tibble(
x = c(0, 179, 342, 467, 705, 878, 1080, 1209, 1458, 1639, 1805, 2000, 2121, 2339, 2462, 2676,
2857, 3049, 3227, 3403, ...
18
votes
1
answer
85k
views
ggplot 'non-finite values' error
I have an R dataframe (df) that looks like this:
blogger; word; n; total
joe; dorothy; 17; 718
paul; sheriff; 10; 354
joe; gray; 9; 718
joe; toto; 9; 718
mick; robin; 9; 607
paul; robin; 9; 354
...
...
17
votes
5
answers
11k
views
Conditional replacement of column name in tibble using dplyr
I have the following tibble:
df <- structure(list(gene_symbol = c("0610005C13Rik", "0610007P14Rik",
"0610009B22Rik", "0610009L18Rik", "0610009O20Rik", "0610010B08Rik"
), foo.control.cv = c(1....
17
votes
3
answers
10k
views
Pass multiple functions to purrr::map
I would like to pass multiple functions at once to one purrr::map call, where the functions need some arguments. As pseudo code:
funs <- c(median, mean)
mtcars %>%
purrr::map(funs, na.rm = ...
17
votes
6
answers
4k
views
set missing values for multiple labelled variables
How do I set missing values for multiple labelled vectors in a data frame? I am working with a survey dataset from SPSS. I am dealing with about 20 different variables, with the same missing values. ...
17
votes
2
answers
7k
views
Controlling decimal places displayed in a tibble. Understanding what pillar.sigfig does
I have a csv file weight.csv with the following contents.
weight,weight_selfreport
81.5,81.66969147005445
72.6,72.59528130671505
92.9,93.01270417422867
79.4,79.4010889292196
94.6,96.64246823956442
80....
16
votes
3
answers
41k
views
Filter by multiple patterns with filter() and str_detect()
I would like to filter a dataframe using filter() and str_detect() matching for multiple patterns without multiple str_detect() function calls. In the example below I would like to filter the ...
16
votes
2
answers
2k
views
Why does !! (bang-bang) combined with as.name() give a different output compared to !! or as.name() alone?
I use a dynamic variable (eg. ID) as a way to reference a column name that will change depending on which gene I am processing at the time. I then use case_when within mutate to create a new column ...
16
votes
3
answers
2k
views
What are helpful optimizations in R for big data sets?
I built a script that works great with small data sets (<1 M rows) and performs very poorly with large datasets. I've heard of data.table as being more performant than tibbles. I'm interested to ...
16
votes
2
answers
10k
views
summarise_at using different functions for different variables
When I use group_by and summarise in dplyr, I can naturally apply different summary functions to different variables. For instance:
library(tidyverse)
df <- tribble(
~category, ~x,...
16
votes
2
answers
17k
views
What is the difference between as.tibble(), as_data_frame(), and tbl_df()?
I remember reading somewhere that as.tibble() is an alias for as_data_frame(), but I don't know what exactly an alias is in programming terminology. Is it similar to a wrapper?
So I guess my question ...
16
votes
3
answers
649
views
Using standard evaluation and do_ to run simulations on a grid of parameters without do.call
Goals
I want to use dplyr to run simulations on grids of parameters. Specifically, I'd like a function that I can use in another program that
gets passed a data.frame
for every row calculates some ...
15
votes
7
answers
749
views
tidyverse: binding list elements of same dimension
Using reduce(bind_cols), list elements of the same dimension may be combined. However, I would like to know how to combine only elements of the same dimension (perhaps a dimension specified in some way) from a ...
15
votes
5
answers
46k
views
tidyverse not loaded, it says "namespace ‘vctrs’ 0.2.0 is already loaded, but >= 0.2.1 is required"
I keep encountering problems with installing the tidyverse package, which prevents me from implementing many text processing tasks. The problem is the same as those mentioned in many previous ...
15
votes
1
answer
8k
views
Use filter() (and other dplyr functions) inside nested data frames with map()
I'm trying to use map() from the purrr package to apply the filter() function to the data stored in a nested data frame.
"Why wouldn't you filter first, and then nest?" you might ask.
That will work (and I'll ...
15
votes
3
answers
1k
views
Combining Rolling Origin Forecast Resampling and Group V-Fold Cross-Validation in rsample
I would like to use the R package rsample to generate resamples of my data.
The package offers the function rolling_origin to produce resamples that keep the time series structure of the data. This ...
15
votes
2
answers
16k
views
How to Transpose (t) in the Tidyverse Using Tidyr
Using the sample data (bottom), I want to use the code below to group and summarise the data. After this, I want to transpose, but I'm stuck on how to use tidyr to achieve this?
For context, I'm ...
15
votes
4
answers
9k
views
How to find which polygon a point belongs to via sf
I have an sf object that contains polygon information (precincts) for a metro area, obtained through a .shp file. For a given lat/lon pair, I want to determine which precinct it belongs to. I'm ...
15
votes
0
answers
2k
views
When should we use curly brackets { } when piping with dplyr [duplicate]
I found out that some expressions can only be piped if inside curly brackets (braces, { }), for instance:
library(dplyr)
3 %>% {3 + .}
3 %>% {ifelse(. < 2, TRUE, FALSE)}
What are the rules ...
14
votes
2
answers
7k
views
How to specify columns to exclude when retaining all distinct rows?
How do you retain all distinct rows in a data frame, excluding certain columns, by specifying only the columns you want to exclude? In the example below
library(dplyr)
dat <- data_frame(
x = c("...
14
votes
3
answers
2k
views
pivot_wider when there's no value column
I'm trying to reshape a dataset from long to wide. The following code works, but I'm curious if there's a way not to provide a value column and still use pivot_wider. In the following example, I have ...
14
votes
1
answer
13k
views
ggplot using grouped date variables (such as year_month)
I feel like this should be an easy task for ggplot, tidyverse, lubridate, but I cannot seem to find an elegant solution.
GOAL: Create a bar graph of my data aggregated/summarized/grouped_by year and ...
14
votes
1
answer
2k
views
Evaluation Error when tidyverse is loaded after Hmisc
I am using R 3.3.3, dplyr 0.7.4, and Hmisc 4.1-1. I noticed that the order in which I load packages affects whether or not a dplyr::summarise call works. I understand that loading packages in a ...
14
votes
1
answer
2k
views
R ggridges package drawing mean line (NOT MEDIAN)
I'd like to draw a line through my ridgeplots for the mean. The built-in quantile arguments draw a line in the style I want, but at the median. How can I draw one at the mean, preferably without using ...
14
votes
2
answers
6k
views
R googlesheets: Unable to use `gs_auth()` in googlesheets package - Sign In With Google Temporarily Disabled App Not Verified Issue
I am unable to authenticate my googlesheets package. Every time I run the gs_auth() command I am taken to Chrome, where I would usually log in to enable the package to access my Google Sheets:
...
14
votes
4
answers
2k
views
Error installing tidyr on Ubuntu 18.04 & R 4.0.2
In trying to install the package tidyverse, I get errors in the installation of dependency tidyr.
Here is the tail of the message I get:
cpp11.cpp:31:100: error: ‘unmove’ is not a member of ‘cpp11’
...
14
votes
3
answers
2k
views
using tidyverse; counting after and before change in value, within groups, generating new variables for each unique shift
I am looking for a tidyverse-solution that can count occurrences of unique values of TF within groups, id in the data datatbl. When TF changes I want to count both forward and backwards from that ...
14
votes
2
answers
2k
views
base R faster than readr for reading multiple CSV files
There is a lot of documentation on how to read multiple CSVs and bind them into one data frame. I have 5000+ CSV files I need to read in and bind into one data structure.
In particular I've followed ...
13
votes
4
answers
15k
views
How to install Tidyverse on Ubuntu 16.04 and 17.04
I'm running Ubuntu 16.04 [now 17.04: see note in bold below] and R 3.4.1. I installed the latter this morning, so I presume it's the latest version. I want to install Tidyverse, which I've spent many ...