Skip to content

abidawson/stringr.plus

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

stringr.plus

Lifecycle: maturing R build status

Overview

stringr.plus provides some additional functions not found in stringr for working with strings. Functions play quite nicely with the tidyverse. The functions are especially useful for working with URLs and file path data to extract specific text from strings.

Installation

You can install the released version of stringr.plus from github with:

# install.packages("remotes")
remotes::install_github("johncassil/stringr.plus")

Examples

Quite frequently when working with urls or filepaths, one would like to strip out everything before or after a pattern or symbol. Sometimes we need to extract the text between patterns:

library(stringr.plus)

url <- 'www.carfax.com/vehicle/3GCPKTE77DG348900'

#What if we need just the base url?
str_extract_before(string = url, pattern = '/')
#> [1] "www.carfax.com"
#If we need just the 'slug' (the part after the base url), we can use:
str_extract_after(string = url, pattern = '/')
#> [1] "vehicle/3GCPKTE77DG348900"
#Or if we need just the last section (the VIN):
str_extract_after(string = url, pattern = 'vehicle/')
#> [1] "3GCPKTE77DG348900"
#Both functions have a num_char argument that can allow specificity for how many characters you want returned:
str_extract_after(string = url, pattern = 'vehicle/', num_char = 5)
#> [1] "3GCPK"
#Additionally, we often are interested in extracting text between two common patterns:
file_path <- "‪C:/Users/pingu/Downloads/a-very-poorly-named-file_08_09_2020.csv"
str_extract_between(string = file_path, pattern1 = '_', pattern2 = ".csv")
#> [1] "08_09_2020"

From time to time, it’s helpful to detect if a string contains more than one pattern, typically for filtering purposes.

#str_detect_multiple with the "and" method ensures that all patterns must be found to return TRUE:
file_path <- "‪C:/Users/pingu/Downloads/a-very-poorly-named-file_08_09_2020.csv"
str_detect_multiple(string = file_path, patterns = c("pingu", "2020"), method = 'and')
#> [1] TRUE
#An easier alias for this is str_detect_multiple_and()
str_detect_multiple_and(string = file_path, patterns = c("Downloads", "csv"))
#> [1] TRUE

Likewise, it’s helpful to detect if a string contains any patterns from a list of patterns.

#str_detect_multiple with the "or" method ensures at least one of patterns must be found to return TRUE:
file_path <- "‪C:/Users/pingu/Downloads/a-very-poorly-named-file_08_09_2020.csv"
str_detect_multiple(string = file_path, patterns = c("very", "purple"), method = 'or')
#> [1] TRUE
#It is also aliased with str_detect_multiple_or()
str_detect_multiple_or(string = file_path, patterns = c("large", "file"))
#> [1] TRUE

These functions inherit properties from and are based on stringr::str_extract(), so patterns can be regular expressions. Consult the documentation for more details.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • R 100.0%