Title: | Balamuta Miscellaneous |
---|---|
Description: | Set of common functions used for manipulating colors, detecting and interacting with 'RStudio', modeling, formatting, determining users' operating system, feature scaling, and more! |
Authors: | James Balamuta [aut, cre, cph] |
Maintainer: | James Balamuta <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.2 |
Built: | 2024-11-08 02:40:09 UTC |
Source: | https://github.com/coatless-rpkg/jjb |
Maintainer: James Balamuta [email protected] (ORCID) [copyright holder]
Useful links:
Report bugs at https://github.com/coatless-rpkg/jjb/issues
Calculates the accuracy of the model as the proportion of observations where the truth, $y$, equals the predicted, $\hat{y}$.
acc(y, yhat)
y |
A |
yhat |
A |
The accuracy of the classification in numeric form.
    # Set seed for reproducibility
    set.seed(100)
    # Generate data
    n = 1e2
    y = round(runif(n))
    yhat = round(runif(n))
    # Compute
    o = acc(y, yhat)
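The description above amounts to the share of positions where truth and prediction agree. A minimal base-R sketch under that reading follows; `acc_sketch` is a hypothetical name used for illustration and is not the package's implementation.

```r
# Hypothetical re-implementation for illustration only.
acc_sketch <- function(y, yhat) {
  stopifnot(length(y) == length(yhat))
  mean(y == yhat)  # proportion of matching entries
}

y    <- c(1, 0, 1, 1)
yhat <- c(1, 1, 1, 0)
acc_sketch(y, yhat)  # 2 of the 4 entries match, so 0.5
```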
Converts temperature recorded in Celsius to Fahrenheit.
celsius_to_fahrenheit(t_celsius)
t_celsius |
Temperature recorded in Celsius. |
A numeric vector.
    celsius_to_fahrenheit(33)
    celsius_to_fahrenheit(0)
Converts temperature recorded in Celsius to Kelvin.
celsius_to_kelvin(t_celsius)
t_celsius |
Temperature recorded in Celsius. |
A numeric vector.
    celsius_to_kelvin(92)
    celsius_to_kelvin(32)
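For reference, the two Celsius conversions follow the standard formulas F = C * 9/5 + 32 and K = C + 273.15. The sketch below writes them out in base R; the `_sketch` names are hypothetical and the package's own code may differ in detail.

```r
# Hypothetical helpers illustrating the standard conversion formulas.
c_to_f_sketch <- function(t_celsius) t_celsius * 9 / 5 + 32
c_to_k_sketch <- function(t_celsius) t_celsius + 273.15

c_to_f_sketch(c(0, 100))  # 32 212
c_to_k_sketch(0)          # 273.15
```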
Returns the character at location i inside the string.
char_at(x, index)
x |
A |
index |
An |
A character vector of length index.
James J Balamuta
    # Example string
    s = "statistics"
    # Single character
    char_at(s, 1)
    # Vectorized position
    char_at(s, c(2, 3))
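One way to get the same kind of result in base R is `substring()`, which is vectorized over its start and stop positions. This is a sketch only; `char_at_sketch` is a hypothetical name and may not match how the package does it.

```r
# Hypothetical helper: extract one character per requested position.
char_at_sketch <- function(x, index) substring(x, index, index)

char_at_sketch("statistics", 1)        # "s"
char_at_sketch("statistics", c(2, 3))  # "t" "a"
```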
Takes a default matrix and embeds circles within the matrix.
circle_matrix(m, n, x.center, y.center, r, f = 1)
m |
A |
n |
A |
x.center |
A |
y.center |
A |
r |
A |
f |
A |
A matrix with circles imprinted within its dimensions.
James Balamuta
    # Generate a basic circle matrix
    circle_matrix(10, 10, 3, 4, 2)
    # Generate two circles within the matrix
    circle_matrix(10, 20, c(3,6), c(4,6), c(2,2))
    # Different fills
    circle_matrix(10, 20, c(3,6), c(4,6), c(2,2), f = c(1,2))
Convert Multiple Columns of a data.frame
All at once conversion of a data.frame from current column types to alternates.
convert_cols(d, cast)
d |
A |
cast |
A |
A data.frame with converted column types.
    n = 100
    st = sample(LETTERS, n, replace = TRUE)
    sr = sample(letters, n, replace = TRUE)
    num = rnorm(n)
    d = data.frame(x = st, y = num, z = sr, stringsAsFactors = FALSE)
    # Convert all columns
    o = convert_cols(d, c("f", "c", "f"))
    # Convert a subset
    d[, c(1, 3)] = convert_cols(d[, c(1, 3)], c("f", "f"))
Checks to see if the user is in RStudio. If so, then it changes the device to a popup window.
external_graphs(ext = TRUE)
ext |
A |
Depending on the operating system, the function attempts to use the following default graphics device:
OS X: quartz()
Linux: x11()
Windows: windows()
Note that this setting is not permanent: the behavioral change lasts only until the end of the session.
The active graphics device is also closed, so any open graphs are deleted and must be redrawn.
There is no return value. Instead, once finished, the function will cause a side effect to occur. See details for more.
James Balamuta
    # Turn on external graphs
    external_graphs()
    # Turn off external graphs
    external_graphs(FALSE)
Converts temperature recorded in Fahrenheit to Celsius.
fahrenheit_to_celsius(t_fahrenheit)
t_fahrenheit |
Temperature recorded in Fahrenheit. |
A numeric vector.
    fahrenheit_to_celsius(92)
    fahrenheit_to_celsius(32)
Converts temperature recorded in Fahrenheit to Kelvin.
fahrenheit_to_kelvin(t_fahrenheit)
t_fahrenheit |
Temperature recorded in Fahrenheit. |
A numeric vector.
    fahrenheit_to_kelvin(92)
    fahrenheit_to_kelvin(32)
Scale features in a dataset.
feature_rescale(x, x_min = NULL, x_max = NULL)
feature_derescale(x_rescaled, x_min, x_max)
feature_norm(x, x_norm = NULL)
feature_denorm(x_norm_std, x_norm = NULL)
feature_standardize(x, x_mean = NULL, x_sd = NULL)
feature_destandardize(x_std, x_mean = NULL, x_sd = NULL)
x | Numeric values |
x_min | Minimum non-normalized numeric value |
x_max | Maximum non-normalized numeric value |
x_rescaled | Rescaled values of x |
x_norm | Euclidean norm of x |
x_norm_std | Euclidean vector of normalized x values |
x_mean | Mean of x values |
x_sd | Standard deviation of x values |
x_std | Z-transformed x values |
The following functions provide a means to either scale features or to descale the features and return them to normal. These functions are ideal for working with optimizers.
Feature Scale | Feature Descale |
feature_rescale | feature_derescale |
feature_norm | feature_denorm |
feature_standardize | feature_destandardize |
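A numeric vector.
Convert the original data $x$ to $x_{\text{rescaled}}$:
$$x_{\text{rescaled}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
To move from the rescaled value to the original value use:
$$x = x_{\text{rescaled}} \left(x_{\max} - x_{\min}\right) + x_{\min}$$
Convert the original data $x$ to $x_{\text{std}}$:
$$x_{\text{std}} = \frac{x - x_{\text{mean}}}{x_{\text{sd}}}$$
To move from the standardized value to the original value use:
$$x = x_{\text{std}} \, x_{\text{sd}} + x_{\text{mean}}$$
Convert the original data $x$ to $x_{\text{norm\_std}}$, where $\lVert x \rVert$ is the Euclidean norm `x_norm`:
$$x_{\text{norm\_std}} = \frac{x}{\lVert x \rVert}$$
To move from the normalized value to the original value use:
$$x = x_{\text{norm\_std}} \, \lVert x \rVert$$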
James Balamuta
    # Rescaling Features
    temperatures = c(94.2, 88.1, 32, 0)
    temp_min = min(temperatures)
    temp_max = max(temperatures)
    temperatures_norm = feature_rescale(temperatures, temp_min, temp_max)
    temperatures_denorm = feature_derescale(temperatures_norm, temp_min, temp_max)
    all.equal(temperatures, temperatures_denorm)
    # Norming Features
    x = 1:10
    x_norm = sqrt(sum(x^2))
    x_norm_std = feature_norm(x, x_norm)
    x_recover = feature_denorm(x_norm_std, x_norm)
    all.equal(x, x_recover)
    # Standardizing Features
    x = 1:10
    x_mean = mean(x)
    x_sd = sd(x)
    x_std = feature_standardize(x, x_mean, x_sd)
    x_recovery = feature_destandardize(x_std, x_mean, x_sd)
    all.equal(x, x_recovery)
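To make the scale/descale pairing concrete, here is a minimal base-R sketch of the min-max round trip. The `rescale_sketch` and `derescale_sketch` names are hypothetical, and the defaults for `x_min` and `x_max` are an assumption about how a `NULL` argument might be filled in.

```r
# Hypothetical min-max scaling pair (not the package's own code).
rescale_sketch <- function(x, x_min = min(x), x_max = max(x)) {
  (x - x_min) / (x_max - x_min)
}
derescale_sketch <- function(x_rescaled, x_min, x_max) {
  x_rescaled * (x_max - x_min) + x_min
}

x <- c(94.2, 88.1, 32, 0)
x_rescaled <- rescale_sketch(x)
range(x_rescaled)                                           # 0 1
all.equal(x, derescale_sketch(x_rescaled, min(x), max(x)))  # TRUE
```

Keeping the original minimum and maximum around is what makes the inverse step possible, which is why the descaling functions take them as arguments.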
Determine the floor and cap of a numeric variable by taking quantiles. Using the quantiles, values in the data found to be lower or higher than the floor or cap are replaced.
floor_and_cap(x, probs = c(0.025, 0.975))
x |
A |
probs |
A |
A vector with the values floored and capped.
    # One case version
    n = 100
    x = rnorm(n)
    x[n - 1] = -99999
    x[n] = 10000
    y = floor_and_cap(x)
    # Dataset example
    d = data.frame(x, y = rnorm(n))
    o = sapply(d, floor_and_cap)
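The quantile-based replacement described above can be sketched with `quantile()`, `pmax()`, and `pmin()` from base R. `floor_and_cap_sketch` is a hypothetical name, and replacing out-of-range values with the quantile bounds themselves is an assumption about the intended behavior.

```r
# Hypothetical sketch: clamp values to the chosen lower/upper quantiles.
floor_and_cap_sketch <- function(x, probs = c(0.025, 0.975)) {
  bounds <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, bounds[1]), bounds[2])
}

x <- c(-99999, 1:98, 10000)
y <- floor_and_cap_sketch(x)
range(y)  # the two extreme values are pulled in to the quantile bounds
```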
This is a helper function for rgb_to_hex. This function takes a single R, G, or B numeric value and converts it to hex.
int_to_hex(n)
n |
An |
A string of length 2.
    int_to_hex(22)
Detects whether R is open in RStudio.
is_rstudio()
A logical value that indicates whether R is open in RStudio.
James Balamuta
    is_rstudio()
Checks whether the submitted value is an integer.
is_whole(x)
x |
A |
A boolean value indicating whether the value is an integer or not.
James Balamuta
    is_whole(2.3)
    is_whole(4)
    is_whole(c(1,2,3))
    is_whole(c(.4,.5,.6))
    is_whole(c(7,.8,9))
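A common base-R idiom for a whole-number check compares each value with its rounded counterpart, within a small tolerance. The sketch below assumes an element-wise result and a tolerance of `.Machine$double.eps^0.5`; both are assumptions, and `is_whole_sketch` is a hypothetical name rather than the package's implementation.

```r
# Hypothetical element-wise whole-number check with a numeric tolerance.
is_whole_sketch <- function(x, tol = .Machine$double.eps^0.5) {
  abs(x - round(x)) < tol
}

is_whole_sketch(4)             # TRUE
is_whole_sketch(c(7, 0.8, 9))  # TRUE FALSE TRUE
```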
Performs a check to determine the OS.
is_windows()
is_macos()
is_linux()
is_sun()
Either TRUE or FALSE
James Joseph Balamuta
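Checks of this kind can be built on base R's `Sys.info()`, whose `sysname` field reports values such as "Windows", "Darwin" (macOS), "Linux", or "SunOS". The helpers below are hypothetical and only sketch the idea; the package may detect the platform differently.

```r
# Hypothetical helpers built on Sys.info(); names are for illustration only.
os_is <- function(name) identical(Sys.info()[["sysname"]], name)

is_windows_sketch <- function() os_is("Windows")
is_macos_sketch   <- function() os_is("Darwin")
is_linux_sketch   <- function() os_is("Linux")
is_sun_sketch     <- function() os_is("SunOS")

# At most one of these flags is TRUE on a given machine.
flags <- c(windows = is_windows_sketch(), macos = is_macos_sketch(),
           linux = is_linux_sketch(), sun = is_sun_sketch())
flags
```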
Converts temperature recorded in Kelvin to Celsius.
kelvin_to_celsius(t_kelvin)
t_kelvin |
Temperature recorded in Kelvin. |
A numeric vector.
    kelvin_to_celsius(92)
    kelvin_to_celsius(32)
Converts temperature recorded in Kelvin to Fahrenheit.
kelvin_to_fahrenheit(t_kelvin)
t_kelvin |
Temperature recorded in Kelvin. |
A numeric vector.
    kelvin_to_fahrenheit(92)
    kelvin_to_fahrenheit(32)
Provides a lagging mechanism for vector data.
lagged(x, lag = 1)
x |
A |
lag |
An |
A vector with lagged values and NAs.
James Balamuta
    x = rnorm(10)
    lagged(x, 2)
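A lag can be sketched in base R by padding the front of the vector with `NA` and dropping the same number of trailing values. The direction of the shift and the name `lagged_sketch` are assumptions made for illustration; the package may behave differently.

```r
# Hypothetical sketch: shift values back by `lag` positions, padding with NA.
lagged_sketch <- function(x, lag = 1) {
  n <- length(x)
  stopifnot(lag >= 0, lag <= n)
  c(rep(NA, lag), x[seq_len(n - lag)])
}

lagged_sketch(1:5, 2)  # NA NA 1 2 3
```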
Obtain the maximum or minimum n elements from a vector.
max_n(x, n = 1L)
min_n(x, n = 1)
x |
Data vector |
n |
Number of observations to select |
The underlying function sorts the data using base::sort() and then extracts out the appropriate n-back or n-forward values.
As a result of the sorting procedure, this is an inefficient function.
A vector containing the maximum/minimum n elements.
    x = 1:10
    # Defaults to traditional max
    # This is more costly to compute than using the regular max function.
    max_n(x)
    # Retrieve top two observations (highest first)
    max_n(x, 2)
    # Missing values have no effect on the sorting procedure
    x[9] = NA
    max_n(x, 3)
    # Defaults to traditional min.
    # This is more costly to compute than using the regular min function.
    min_n(x)
    min(x)
    # Retrieve bottom two observations (lowest first)
    min_n(x, 2)
    # Missing values have no effect on the sorting procedure
    x[2] = NA
    min_n(x, 3)
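The sort-then-extract approach described above can be sketched with `sort()` and `head()`. The `_sketch` names are hypothetical; note that `sort()` drops `NA` values by default, which is consistent with missing values not affecting the result.

```r
# Hypothetical sort-based sketches (not the package's own code).
max_n_sketch <- function(x, n = 1L) head(sort(x, decreasing = TRUE), n)
min_n_sketch <- function(x, n = 1L) head(sort(x), n)

x <- c(4, 9, NA, 1, 7)
max_n_sketch(x, 2)  # 9 7
min_n_sketch(x, 2)  # 1 4
```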
Create a directory using either a relative path or an absolute path.
mkdir(dir, r = TRUE)
dir |
A |
r |
A |
New directory on file system
James Balamuta
    # Make directory from working directory
    mkdir("toad")
    ## This assumes the computer is on Windows and the C drive exists.
    # Make directory from absolute path
    mkdir("C:/path/to/dir/toad")
Calculates the mean square error of the model by taking the mean of the squared differences between the truth, $y$, and the predicted, $\hat{y}$, at each observation $i$.
mse(y, yhat)
y |
A |
yhat |
A |
The equation for MSE is:
$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$$
The MSE in numeric form.
    # Set seed for reproducibility
    set.seed(100)
    # Generate data
    n = 1e2
    y = rnorm(n)
    yhat = rnorm(n, 0.5)
    # Compute
    o = mse(y, yhat)
Computes the proportion of data that is missing in a given data set.
na_prop_overall(x)
na_prop_by_variable(x)
na_prop_by_observation(x)
na_count_overall(x)
na_count_by_variable(x)
na_count_by_observation(x)
x |
A vector of length |
Overall: a single numeric value between [0, 1] or a count between [0, N].
Variable: different numeric values between [0, 1] or counts between [0, N].
Observation: different numeric values between [0, 1] or counts between [0, P].
    # By vector
    x = c(1, 2, NA, 4)
    na_prop_overall(x)
    na_count_overall(x)
    # By Data Frame
    missing_df = data.frame(
      a = c(1, 2, NA, 4),
      b = c(3, NA, 2, NA)
    )
    # Proportion
    na_prop_overall(missing_df)
    na_prop_by_variable(missing_df)
    na_prop_by_observation(missing_df)
    # Counts
    na_count_overall(missing_df)
    na_count_by_variable(missing_df)
    na_count_by_observation(missing_df)
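The three views of missingness (overall, by variable, by observation) map naturally onto `mean()`, `colMeans()`, and `rowMeans()` applied to `is.na()`. This is a base-R sketch of the idea, not necessarily how the package computes it.

```r
# Base-R sketch of missing-data proportions on a small data frame.
missing_df <- data.frame(a = c(1, 2, NA, 4), b = c(3, NA, 2, NA))
na_mask <- is.na(missing_df)  # logical matrix, TRUE where a value is missing

mean(na_mask)      # overall: 3 of 8 cells missing, 0.375
colMeans(na_mask)  # by variable: a = 0.25, b = 0.50
rowMeans(na_mask)  # by observation: 0.0 0.5 0.5 0.5
sum(na_mask)       # overall count: 3
```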
Adds zeros before the start of a number.
pad_number(x)
x |
A |
A character vector that is padded to the length of the maximum entry.
James Balamuta
    # Padding applied
    pad_number(8:10)
    # No padding applied
    pad_number(2:3)
    # Pads non-negative number with 0.
    # This needs to be improved slightly...
    pad_number(-1:1)
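Zero padding to the width of the longest entry can be sketched with base R's `formatC()`. `pad_number_sketch` is a hypothetical name, and the sketch is only checked for non-negative integers; the package's handling of negative numbers is not reproduced here.

```r
# Hypothetical sketch: left-pad with zeros to the widest entry.
pad_number_sketch <- function(x) {
  formatC(x, width = max(nchar(x)), flag = "0")
}

pad_number_sketch(8:10)  # "08" "09" "10"
pad_number_sketch(2:3)   # "2" "3"
```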
Mandates the presence of a specific operating system.
require_linux()
require_windows()
require_macos()
require_sun()
If any of these functions is called on the wrong operating system, a stop error is triggered and the function fails.
James Joseph Balamuta
This function converts an RGB value to the hexadecimal numbering system.
rgb_to_hex(R, G, B, pound = TRUE)
R |
A |
G |
A |
B |
A |
pound |
A |
A string containing the hexadecimal information.
    # Hexadecimal with pound sign
    rgb_to_hex(255,255,255)
    # Hexadecimal without pound sign
    rgb_to_hex(255,255,255,FALSE)
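The RGB-to-hex conversion can be sketched with `sprintf()` and the `%02X` format, which prints each channel as two hexadecimal digits. `rgb_to_hex_sketch` is a hypothetical name, and the uppercase output is an assumption; the package may format the digits differently.

```r
# Hypothetical sketch: two hex digits per channel, optional leading "#".
rgb_to_hex_sketch <- function(R, G, B, pound = TRUE) {
  hex <- sprintf("%02X%02X%02X", as.integer(R), as.integer(G), as.integer(B))
  if (pound) paste0("#", hex) else hex
}

rgb_to_hex_sketch(255, 255, 255)        # "#FFFFFF"
rgb_to_hex_sketch(22, 150, 230, FALSE)  # "1696E6"
```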
Calculates the root mean square error of the model by taking the square root of the mean of the squared differences between the truth, $y$, and the predicted, $\hat{y}$, at each observation $i$.
rmse(y, yhat)
y |
A |
yhat |
A |
The formula for RMSE is:
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}$$
The RMSE in numeric form.
    # Set seed for reproducibility
    set.seed(100)
    # Generate data
    n = 1e2
    y = rnorm(n)
    yhat = rnorm(n, 0.5)
    # Compute
    o = rmse(y, yhat)
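Both error measures follow directly from their definitions: the MSE is the mean of the squared differences, and the RMSE is its square root. The base-R sketch below uses hypothetical `_sketch` names and is not the package's implementation.

```r
# Hypothetical sketches of the two error measures.
mse_sketch  <- function(y, yhat) mean((y - yhat)^2)
rmse_sketch <- function(y, yhat) sqrt(mse_sketch(y, yhat))

y    <- c(0, 0, 0, 0)
yhat <- c(2, 2, 2, 2)
mse_sketch(y, yhat)   # every squared difference is 4, so the mean is 4
rmse_sketch(y, yhat)  # sqrt(4) = 2
```

Because the RMSE is on the same scale as the response, it is often easier to interpret than the MSE.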
The function shades or darkens an RGB value by adding black to the values.
shade(rgb_value, shade_factor = 0.1)
rgb_value |
A |
shade_factor |
A |
A matrix
with dimensions .
    shade(c(22, 150, 230), shade_factor = 0.5)
System Architecture
system_arch()
Either "x64" or "x32"
Provides the default operating system graphics utility
system_graphic_driver()
A string that is either:
"quartz": if on MacOS
"windows": if on Windows
"x11": if on Linux or Solaris
James Balamuta
    # Returns a string depending on test platform
    system_graphic_driver()
The function tints or lightens an RGB value by adding white to the values.
tint(rgb_value, tint_factor = 0.2)
rgb_value |
A |
tint_factor |
A |
A matrix
with dimensions .
    tint(c(22, 150, 230), tint_factor = 0.5)
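A common way to express "adding black" and "adding white" to an RGB value is to move each channel toward 0 (shade) or toward 255 (tint) by the given factor. The sketch below assumes those formulas and returns plain numeric vectors; the hypothetical `_sketch` functions may differ from the package in both formula and return shape.

```r
# Hypothetical sketches, assuming linear blending toward black or white.
shade_sketch <- function(rgb_value, shade_factor = 0.1) {
  rgb_value * (1 - shade_factor)
}
tint_sketch <- function(rgb_value, tint_factor = 0.2) {
  rgb_value + (255 - rgb_value) * tint_factor
}

shade_sketch(c(22, 150, 230), shade_factor = 0.5)  # 11 75 115
tint_sketch(c(22, 150, 230), tint_factor = 0.5)    # 138.5 202.5 242.5
```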
Calculates and returns the trace of a square matrix.
tr(x)
x |
A |
A numeric value equal to the sum of the diagonal elements of the matrix.
James Balamuta
    # I_2 matrix
    tr(diag(2))
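The trace of a square matrix is the sum of its diagonal entries, which base R expresses as `sum(diag(x))`. The sketch below uses the hypothetical name `tr_sketch` and adds a squareness check as an assumption.

```r
# Hypothetical sketch: trace as the sum of the diagonal.
tr_sketch <- function(x) {
  stopifnot(is.matrix(x), nrow(x) == ncol(x))
  sum(diag(x))
}

tr_sketch(diag(2))                 # 2
tr_sketch(matrix(1:4, nrow = 2))   # diagonal is 1 and 4, so 5
```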
Takes a string, forces characters to lower case, removes punctuation, and then switches spaces to - instead of _.
url_title(st)
st |
A |
A string with the aforementioned modifications.
James Balamuta
    url_title("My Name is Jaime!")
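The three steps in the description (lower-casing, stripping punctuation, replacing spaces with hyphens) can be sketched with `tolower()` and `gsub()`. `url_title_sketch` is a hypothetical name, and collapsing runs of whitespace into a single hyphen is an assumption.

```r
# Hypothetical sketch of the slug-style transformation.
url_title_sketch <- function(st) {
  st <- tolower(st)                  # force lower case
  st <- gsub("[[:punct:]]", "", st)  # drop punctuation
  gsub("[[:space:]]+", "-", st)      # spaces become hyphens
}

url_title_sketch("My Name is Jaime!")  # "my-name-is-jaime"
```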