Package 'mgsub'

Title: Safe, Multiple, Simultaneous String Substitution
Description: Designed to enable simultaneous substitution in strings in a safe fashion. Safe means it does not rely on placeholders (which can cause errors in same length matches).
Authors: Mark Ewing [aut, cre]
Maintainer: Mark Ewing <[email protected]>
License: MIT + file LICENSE
Version: 1.7.3
Built: 2025-01-10 03:00:00 UTC
Source: https://github.com/bmewing/mgsub

Help Index


mgsub_censor worker

Description

Internal worker function that performs the actual censoring for mgsub_censor.

Usage

censor_worker(
  string,
  pattern,
  censor,
  split = any(nchar(censor) > 1),
  seed = NULL,
  ...
)

Arguments

string

a character vector where replacements are sought

pattern

Character string to be matched in the given character vector

censor

character to use in censoring - see details

split

logical. If a multicharacter censor pattern is provided, should it be split into single characters to preserve the original string length?

seed

optional parameter to fix sampling of multicharacter censors

...

arguments to pass to regexpr family


Fast escape replace

Description

Fast escape function for the limited case where only one of the provided patterns actually matches anything.

Usage

fast_replace(string, pattern, replacement, ...)

Arguments

string

a character vector where replacements are sought

pattern

Character string to be matched in the given character vector

replacement

Character vector, either equal in length to pattern or of length one, giving the replacement for each matched pattern.

...

arguments to pass to gsub
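
The shortcut described above can be illustrated in base R. This is a minimal sketch, not the package's implementation: the name single_hit_replace and its error handling are hypothetical, and it only assumes that when exactly one pattern matches, a plain gsub call is sufficient because no other replacement can interfere.

```r
# Hypothetical illustration of the "only one pattern matches" shortcut.
single_hit_replace <- function(string, pattern, replacement, ...) {
  # Record which patterns match anywhere in the input
  hits <- vapply(pattern, function(p) any(grepl(p, string, ...)), logical(1))
  if (sum(hits) != 1) stop("expected exactly one matching pattern")
  # With a single live pattern, an ordinary gsub is already safe
  gsub(pattern[hits], replacement[hits], string, ...)
}

res <- single_hit_replace("hello world",
                          pattern = c("world", "moon"),
                          replacement = c("there", "sun"))
print(res)
```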


Filter overlaps from matches

Description

Helper function used to identify which results from gregexpr overlap other matches and filter out shorter, overlapped results

Usage

filter_overlap(x)

Arguments

x

Matrix of gregexpr results with 4 columns: index of column matched, start of match, length of match, and end of match. Produced exclusively by a worker function in mgsub.
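
To make the 4-column layout concrete, the following base R sketch drops matches that are overlapped by a longer match. The function drop_overlapped is hypothetical and its "longest match wins" rule is an assumption based on the description above; it is not the package's filter_overlap.

```r
# Hypothetical helper operating on a matrix with columns:
# 1 = index, 2 = start of match, 3 = length of match, 4 = end of match
drop_overlapped <- function(x) {
  # Visit the longest matches first so they take priority
  x <- x[order(-x[, 3], x[, 2]), , drop = FALSE]
  keep <- logical(nrow(x))
  for (r in seq_len(nrow(x))) {
    kept <- x[keep, , drop = FALSE]
    # Two ranges overlap when each one starts before the other ends
    overlaps <- any(x[r, 2] <= kept[, 4] & x[r, 4] >= kept[, 2])
    keep[r] <- !overlaps
  }
  out <- x[keep, , drop = FALSE]
  # Return the surviving matches in order of position
  out[order(out[, 2]), , drop = FALSE]
}

m <- rbind(c(1, 1, 3, 3),   # characters 1-3
           c(2, 2, 5, 6),   # characters 2-6, overlaps the first and is longer
           c(3, 8, 2, 9))   # characters 8-9, no overlap
filtered <- drop_overlapped(m)
print(filtered)
```

In this example the first row is discarded because the longer second match covers part of it, while the third row is untouched.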


Get all matches

Description

Helper function to be used in a loop to check each pattern provided for matches

Usage

get_matches(string, pattern, i, ...)

Arguments

string

a character vector where replacements are sought

pattern

Character string to be matched in the given character vector

i

an iterator provided by a looping function

...

arguments to pass to gregexpr
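
A rough sketch of how one pattern's gregexpr results could be collected into a 4-column matrix (index, start, length, end) is shown below. The name collect_matches and the handling of the no-match case are assumptions made for illustration; the package's get_matches may differ.

```r
# Hypothetical helper: gather matches of pattern[i] in a single string.
collect_matches <- function(string, pattern, i, ...) {
  m <- gregexpr(pattern[i], string, ...)[[1]]
  # gregexpr reports -1 when nothing matched
  if (m[1] == -1) return(matrix(numeric(0), ncol = 4))
  start <- as.integer(m)
  len <- attr(m, "match.length")
  # Columns: pattern index, start, length, end
  unname(cbind(i, start, len, start + len - 1))
}

found <- collect_matches("hey, ho", c("hey", "ho"), i = 2)
print(found)
```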


Safe, multiple gsub

Description

mgsub - A safe, simultaneous, multiple global string replacement wrapper that allows access to multiple methods of specifying matches and replacements.

Usage

mgsub(string, pattern, replacement, recycle = FALSE, ...)

Arguments

string

a character vector where replacements are sought

pattern

Character string to be matched in the given character vector

replacement

Character vector, either equal in length to pattern or of length one, giving the replacement for each matched pattern.

recycle

logical. Should replacement be recycled if its length differs from that of pattern?

...

arguments to pass to regexpr / sub

Value

Converted string.

Examples

mgsub("hey, ho", pattern = c("hey", "ho"), replacement = c("ho", "hey"))
mgsub("developer", pattern = c("e", "p"), replacement = c("p", "e"))
mgsub("The chemical Dopaziamine is fake",
      pattern = c("dopa(.*?) ", "fake"),
      replacement = c("mega\\1 ", "real"),
      ignore.case = TRUE)
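
For contrast, the base R sketch below shows why chained gsub calls are unsafe for the swap in the first example, followed by one single-pass alternative. The single-pass version uses gregexpr with regmatches<- as a stand-in technique; it is not mgsub's algorithm, and the lookup vector swap is a name introduced only for this illustration.

```r
s <- "hey, ho"

# Sequential substitution: the second call rewrites the output of the first
step1 <- gsub("hey", "ho", s)           # "ho, ho"
sequential <- gsub("ho", "hey", step1)  # "hey, hey"

# Single pass: find every match first, then replace them all at once
swap <- c(hey = "ho", ho = "hey")
m <- gregexpr("hey|ho", s)
simultaneous <- s
regmatches(simultaneous, m) <- list(unname(swap[regmatches(s, m)[[1]]]))

print(sequential)    # "hey, hey"
print(simultaneous)  # "ho, hey"
```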

Safe, multiple censoring of text strings

Description

mgsub_censor - A safe, simultaneous, multiple global string censoring (replace matches with a censoring character like '*')

Usage

mgsub_censor(
  string,
  pattern,
  censor = "*",
  split = any(nchar(censor) > 1),
  seed = NULL,
  ...
)

Arguments

string

a character vector to censor

pattern

regular expressions used to identify where to censor

censor

character to use in censoring - see details

split

logical. If a multicharacter censor pattern is provided, should it be split into single characters to preserve the original string length?

seed

optional parameter to fix sampling of multicharacter censors

...

arguments to pass to regexpr / sub

Details

When censor is provided as a >1 length vector or as a multicharacter string with split = TRUE, it will be sampled to return random censoring patterns. This can be helpful if you want to create cartoonish swear censoring. If needed, the randomization can be controlled with the seed argument.
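
The sampling behaviour described above can be sketched in base R as follows. The helper censor_match is hypothetical and only illustrates splitting a multicharacter censor, sampling from it, and fixing the result with a seed; it is not the code used by mgsub_censor.

```r
# Hypothetical helper: build a random censor of n characters.
censor_match <- function(n, censor, seed = NULL) {
  # Fixing the seed makes the sampled pattern reproducible
  if (!is.null(seed)) set.seed(seed)
  # Split the censor into single characters to sample from
  pool <- unlist(strsplit(censor, ""))
  paste(sample(pool, n, replace = TRUE), collapse = "")
}

a <- censor_match(4, "#@!$%", seed = 42)
b <- censor_match(4, "#@!$%", seed = 42)
print(a)
print(identical(a, b))
```

Because one character is drawn per censored position, the censored text keeps the length of the original match.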

Value

Censored string.

Examples

mgsub_censor("Flowers for a friend", pattern=c("low"), censor="*")

mgsub worker

Description

Internal worker function that performs the actual substitution for mgsub.

Usage

worker(string, pattern, replacement, ...)

Arguments

string

a character vector where replacements are sought

pattern

Character string to be matched in the given character vector

replacement

Character vector, either equal in length to pattern or of length one, giving the replacement for each matched pattern.

...

arguments to pass to regexpr family