## Breaking changes for package developers
* The major change in this version is that dplyr now depends on the selecting
backend of the tidyselect package. If you have been linking to the
`dplyr::select_helpers` documentation topic, you should update the link to
point to `tidyselect::select_helpers`.
* Another change that causes warnings in packages is that dplyr now exports the
`exprs()` function. This causes a collision with `Biobase::exprs()`. Either
import functions from dplyr selectively rather than in bulk, or do not import
`Biobase::exprs()` and refer to it with a namespace qualifier.
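
As a rough sketch of how the `exprs()` collision can be avoided, a package that uses both dplyr and Biobase might combine the two workarounds above: import only the dplyr functions it needs, and call the Biobase accessor with a namespace qualifier. This assumes the package generates its `NAMESPACE` with roxygen2. The imported functions (`select`, `mutate`), the helper `get_expression_matrix()` and its `eset` argument are hypothetical placeholders.

```r
# Import only the dplyr functions the package actually uses, so that
# dplyr's exprs() is never pulled in by a bulk import.
#' @importFrom dplyr select mutate
NULL

# Hypothetical helper: the Biobase accessor is called with an explicit
# namespace qualifier instead of being imported.
get_expression_matrix <- function(eset) {
  Biobase::exprs(eset)
}
```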
## Bug fixes
* Reindexing grouped data frames never updates the `"class"` attribute. This also avoids unintended updates to the original object (#3438).
* `do()` operations with more than one named argument can access `.` (#2998).
* Summaries of summaries (such as `summarise(b = sum(a), c = sum(b))`) are now
computed using standard evaluation, which is simpler and more correct but
slightly slower (#3233).
* Fixed rare column name clash in joins with non-join columns of the same name in both tables (#3266).
* Fix `summarise()` for empty data frames with zero columns (#3071).
* Fix `row_number()` and `ntile()` ordering to use the locale-dependent ordering functions in R when dealing with character vectors, rather than always using the C-locale ordering function in C (#2792, @foo-bar-baz-qux).
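
To illustrate the "summaries of summaries" case mentioned above, here is a minimal sketch in which a later summary refers to one created earlier in the same `summarise()` call. The data frame `df` and its values are made up for illustration.

```r
library(dplyr)

# Small illustrative data frame (values are arbitrary)
df <- data.frame(a = c(1, 2, 3))

# `c` refers to `b`, which was created earlier in the same call
res <- summarise(df, b = sum(a), c = sum(b))
```

With this input, `b` is the sum of `a`, and `c` is computed from the single summarised value `b`, so both columns hold 6.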
## Changes
* `sym()`, `syms()`, `expr()`, `exprs()` and `enexpr()` are now
exported. `sym()` and `syms()` construct symbols from strings or character
vectors. The `expr()` variants are equivalent to `quo()`, `quos()` and
`enquo()` but return simple expressions rather than quosures. They support
quasiquotation.
* dplyr now depends on the new tidyselect package to power `select()`,
`rename()`, `pull()` and their variants (#2896). Consequently
`select_vars()`, `select_var()` and `rename_vars()` are
soft-deprecated and will start issuing warnings in a future version.
Following the switch to tidyselect, `select()` and `rename()` fully support
character vectors. You can now unquote variables like this:
```
vars <- c("disp", "cyl")
select(mtcars, !! vars)     # keep `disp` and `cyl`
select(mtcars, -(!! vars))  # drop `disp` and `cyl`
```
Note that this only works in selecting functions because in other contexts
strings and character vectors are ambiguous. For instance, strings are a valid
input in mutating operations, and `mutate(df, "foo")` creates a new column by
recycling `"foo"` to the number of rows.
* Support for raw vector columns in `mutate()`, `summarise()`, `arrange()`, `group_by()`
and joins (minimal `raw` x `raw` support initially) (#1803).
* Hybrid evaluation simplifies `dplyr::foo` to `foo` (#3309). Hybrid functions can now be masked by regular R functions to turn off hybrid evaluation (#3255). The hybrid evaluator finds functions from dplyr even if dplyr is not attached (#3456).
* Scoped select and rename functions (`select_all()`, `rename_if()` etc.) now work with grouped data frames, adapting the grouping as necessary (#2947, #3410). `group_by_at()` can group by an existing grouping variable (#3351). `arrange_at()` can use grouping variables (#3332).
* `row_number()` works on empty subsets (#3454).
* `bind_cols()` handles unnamed lists (#3402).
* `select()` and `vars()` now treat `NULL` as empty inputs (#3023).
* Support `!!!` in `recode_factor()` (#3390).
* `distinct()` now supports renaming columns (#3234).
* Using a `data.frame` on the right-hand side of `mutate()` is no longer allowed (#3298).
* `combine()` returns `logical()` when all inputs are `NULL` (or when there are no inputs) (#3365, @zeehio).
* `bind_rows()` works around corrupt columns that have the object bit set while having no class attribute (#3349).
* `slice()` no longer enforces tibble classes when the input is a simple `data.frame`, and ignores 0 (#3297, #3313).
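
A few of the changes listed above can be tried out with a short sketch like the following. It uses the built-in `mtcars` data; the variable names (`var`, `mean_call`, `avg`, `empty`, `cyls`, `tagged`) are only illustrative, and the exact printed output may differ between dplyr versions.

```r
library(dplyr)

# sym() builds a symbol from a string; expr() captures an expression
# and supports unquoting with !!
var <- sym("cyl")
mean_call <- expr(mean(!!var))   # the unevaluated call mean(cyl)

# Symbols can be unquoted inside dplyr verbs
avg <- summarise(mtcars, avg = mean(!!var))

# NULL is treated as an empty selection
empty <- select(mtcars, NULL)

# distinct() can rename the columns it keeps
cyls <- distinct(mtcars, cylinders = cyl)

# In mutating verbs a string is a plain value, recycled to every row
tagged <- mutate(head(mtcars, 2), "foo")
```

Note that `sym()` and `expr()` return unevaluated objects, so `mean_call` is not computed until it is evaluated or unquoted into a verb.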
## Documentation
* Improved documentation for set operations (#3238, @edublancas).
* Improved documentation for `funs()` (#3094).
## Error messages
* Dedicated error message when trying to use columns of the `Interval` or `Period` classes (#2568).
* Show clear error message for bad arguments to `funs()` (#3368).
* Add warning with explanation to `distinct()` if any of the selected columns are of type `list` (#3088, @foo-bar-baz-qux).
* Better error message if dbplyr is not installed when accessing database backends (#3225).
* Corrected error message when calling `cbind()` with an object of wrong length (#3085).
* Better error message when joining data frames with duplicate or `NA` column names. Joining such data frames with a semi- or anti-join now gives a warning, which may be converted to an error in future versions (#3243, #3417).
* `distinct()` now gives a warning when used on unknown columns (#2867, @foo-bar-baz-qux).
* `arrange()` fails gracefully on `data.frame` columns (#3153).
* Added an `.onDetach()` hook so that, once dplyr has been detached, plyr can be loaded and attached without the warning that functions in dplyr will be masked (#3359, @jwnorman).
## Performance
* `sample_n()` and `sample_frac()` on grouped data frames are now faster, especially for data with a large number of groups (#3193, @saurfang).
## Internal
* Compute variable names for joins in R (#3430).
* Bumped Rcpp dependency to 0.12.15 to avoid imperfect detection of `NA` values in hybrid evaluation fixed in RcppCore/Rcpp#790 (#2919).
* Avoid cleaning the data mask, a temporary environment used to evaluate expressions. If the environment in which an expression (e.g. in `mutate()`) is evaluated is preserved until after the operation, accessing variables from that environment now gives a warning but still returns `NULL` (#3318).