talib indicators accept a data.frame and
return a data.frame—but the returned data frame contains
only the indicator columns, not the original data. This
is by design: it keeps the core API minimal and composable. In a
tidyverse pipeline, however, you usually want the indicator columns
attached to your existing data so you can keep piping.
This article builds a thin wrapper called tidy_ta() that
bridges that gap, then puts it to work in increasingly realistic
scenarios.
library(talib)
#> Loading {talib} v0.9.0
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
The gap
Piping into a talib indicator works—x is
the first argument:
BTC %>%
  RSI(n = 14) %>%
  tail()
#> RSI
#> 2024-12-26 01:00:00 46.48851
#> 2024-12-27 01:00:00 43.85488
#> 2024-12-28 01:00:00 45.93888
#> 2024-12-29 01:00:00 43.12301
#> 2024-12-30 01:00:00 41.47686
#> 2024-12-31 01:00:00 43.37358
The result is a one-column data frame with just the RSI values. The original price data is gone. To keep both, you need to bind the indicator output back to the input.
Building tidy_ta()
The simplest version takes a data frame, passes it to an indicator, and column-binds the result:
tidy_ta <- function(.data, .f, ...) {
  dplyr::bind_cols(.data, .f(.data, ...))
}
Three lines, and every indicator in talib is now pipe-friendly:
BTC %>%
  tidy_ta(RSI, n = 14) %>%
  tail()
#> open high low close volume RSI
#> 2024-12-26 01:00:00 99356.00 99879.98 95088.99 95676.01 2872.119 46.48851
#> 2024-12-27 01:00:00 95676.00 97388.00 93320.54 94167.77 3483.965 43.85488
#> 2024-12-28 01:00:00 94167.78 95563.86 94000.00 95120.76 1333.381 45.93888
#> 2024-12-29 01:00:00 95124.01 95170.03 92831.78 93564.00 2131.462 43.12301
#> 2024-12-30 01:00:00 93564.00 94915.26 91310.00 92628.01 4069.841 41.47686
#> 2024-12-31 01:00:00 92624.41 96132.00 91884.18 93390.63 2960.960 43.37358
Multi-column indicators work the same way. Bollinger Bands returns three columns, and all three get bound:
BTC %>%
  tidy_ta(bollinger_bands) %>%
  tail()
#> open high low close volume UpperBand
#> 2024-12-26 01:00:00 99356.00 99879.98 95088.99 95676.01 2872.119 104478.35
#> 2024-12-27 01:00:00 95676.00 97388.00 93320.54 94167.77 3483.965 100877.73
#> 2024-12-28 01:00:00 94167.78 95563.86 94000.00 95120.76 1333.381 99886.22
#> 2024-12-29 01:00:00 95124.01 95170.03 92831.78 93564.00 2131.462 99871.12
#> 2024-12-30 01:00:00 93564.00 94915.26 91310.00 92628.01 4069.841 99713.92
#> 2024-12-31 01:00:00 92624.41 96132.00 91884.18 93390.63 2960.960 99373.89
#> MiddleBand LowerBand
#> 2024-12-26 01:00:00 98217.88 91957.42
#> 2024-12-27 01:00:00 97020.16 93162.59
#> 2024-12-28 01:00:00 96516.01 93145.81
#> 2024-12-29 01:00:00 96134.41 92397.71
#> 2024-12-30 01:00:00 95620.42 91526.92
#> 2024-12-31 01:00:00 95236.42 91098.95
Chaining multiple indicators composes naturally:
BTC %>%
  tidy_ta(RSI, n = 14) %>%
  tidy_ta(bollinger_bands) %>%
  tidy_ta(MACD) %>%
  tail()
#> open high low close volume RSI
#> 2024-12-26 01:00:00 99356.00 99879.98 95088.99 95676.01 2872.119 46.48851
#> 2024-12-27 01:00:00 95676.00 97388.00 93320.54 94167.77 3483.965 43.85488
#> 2024-12-28 01:00:00 94167.78 95563.86 94000.00 95120.76 1333.381 45.93888
#> 2024-12-29 01:00:00 95124.01 95170.03 92831.78 93564.00 2131.462 43.12301
#> 2024-12-30 01:00:00 93564.00 94915.26 91310.00 92628.01 4069.841 41.47686
#> 2024-12-31 01:00:00 92624.41 96132.00 91884.18 93390.63 2960.960 43.37358
#> UpperBand MiddleBand LowerBand MACD MACDSignal
#> 2024-12-26 01:00:00 104478.35 98217.88 91957.42 608.68287 1590.9032
#> 2024-12-27 01:00:00 100877.73 97020.16 93162.59 243.07106 1321.3368
#> 2024-12-28 01:00:00 99886.22 96516.01 93145.81 29.87501 1063.0444
#> 2024-12-29 01:00:00 99871.12 96134.41 92397.71 -261.68536 798.0985
#> 2024-12-30 01:00:00 99713.92 95620.42 91526.92 -561.79956 526.1189
#> 2024-12-31 01:00:00 99373.89 95236.42 91098.95 -729.69370 274.9564
#> MACDHist
#> 2024-12-26 01:00:00 -982.2204
#> 2024-12-27 01:00:00 -1078.2657
#> 2024-12-28 01:00:00 -1033.1694
#> 2024-12-29 01:00:00 -1059.7838
#> 2024-12-30 01:00:00 -1087.9184
#> 2024-12-31 01:00:00 -1004.6501
Handling column-name collisions
If you add two SMAs with different periods, both return a column
named SMA and bind_cols() disambiguates with
ugly suffixes like SMA...6. A .suffix
parameter fixes this:
tidy_ta <- function(.data, .f, ..., .suffix = NULL) {
  result <- .f(.data, ...)
  # Optionally tag every indicator column, e.g. SMA -> SMA_10
  if (!is.null(.suffix)) {
    colnames(result) <- paste(colnames(result), .suffix, sep = "_")
  }
  dplyr::bind_cols(.data, result)
}
Now each indicator gets a clear name:
BTC %>%
  tidy_ta(SMA, n = 10, .suffix = "10") %>%
  tidy_ta(SMA, n = 20, .suffix = "20") %>%
  tail()
#> open high low close volume SMA_10
#> 2024-12-26 01:00:00 99356.00 99879.98 95088.99 95676.01 2872.119 98217.88
#> 2024-12-27 01:00:00 95676.00 97388.00 93320.54 94167.77 3483.965 97020.16
#> 2024-12-28 01:00:00 94167.78 95563.86 94000.00 95120.76 1333.381 96516.01
#> 2024-12-29 01:00:00 95124.01 95170.03 92831.78 93564.00 2131.462 96134.41
#> 2024-12-30 01:00:00 93564.00 94915.26 91310.00 92628.01 4069.841 95620.42
#> 2024-12-31 01:00:00 92624.41 96132.00 91884.18 93390.63 2960.960 95236.42
#> SMA_20
#> 2024-12-26 01:00:00 99594.80
#> 2024-12-27 01:00:00 99305.69
#> 2024-12-28 01:00:00 99002.74
#> 2024-12-29 01:00:00 98814.94
#> 2024-12-30 01:00:00 98613.30
#> 2024-12-31 01:00:00 98223.05
The cols argument is forwarded through ...,
so column remapping still works:
BTC %>%
  tidy_ta(RSI, cols = ~high, n = 14) %>%
  tail()
#> open high low close volume RSI
#> 2024-12-26 01:00:00 99356.00 99879.98 95088.99 95676.01 2872.119 50.88773
#> 2024-12-27 01:00:00 95676.00 97388.00 93320.54 94167.77 3483.965 45.83913
#> 2024-12-28 01:00:00 94167.78 95563.86 94000.00 95120.76 1333.381 42.51414
#> 2024-12-29 01:00:00 95124.01 95170.03 92831.78 93564.00 2131.462 41.80903
#> 2024-12-30 01:00:00 93564.00 94915.26 91310.00 92628.01 4069.841 41.33147
#> 2024-12-31 01:00:00 92624.41 96132.00 91884.18 93390.63 2960.960 44.58689
This is the complete wrapper. The rest of the article uses it as-is.
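As an aside, the name repair that .suffix avoids can be reproduced without talib. The sketch below uses small made-up data frames (prices, sma_a and sma_b are hypothetical stand-ins for real indicator output) and relies only on the default name repair in bind_cols():

```r
library(dplyr)

# Hypothetical stand-ins: a price column and two indicator results
# that both name their column SMA
prices <- data.frame(close = c(10, 11, 12))
sma_a  <- data.frame(SMA = c(NA, 10.5, 11.5))
sma_b  <- data.frame(SMA = c(NA, NA, 11))

# Binding as-is: the duplicated name is repaired with positional suffixes
collided <- suppressMessages(bind_cols(prices, sma_a, sma_b))
names(collided)  # close, SMA...2, SMA...3

# Renaming before binding, which is what .suffix does, keeps names readable
names(sma_a) <- paste(names(sma_a), "2", sep = "_")
names(sma_b) <- paste(names(sma_b), "3", sep = "_")
renamed <- bind_cols(prices, sma_a, sma_b)
names(renamed)   # close, SMA_2, SMA_3
```

The number in a repaired name is the column position, which is why the collision on the five-column BTC data shows up as SMA...6.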
Grouped operations across assets
A common task is computing the same indicator across multiple
tickers. Stack the data, nest() by ticker, apply
tidy_ta() inside each group, and unnest():
# Stack the three assets; row names are kept as a character date column
assets <- bind_rows(
  BTC %>% as_tibble(rownames = "date") %>% mutate(ticker = "BTC"),
  SPY %>% as_tibble(rownames = "date") %>% mutate(ticker = "SPY"),
  NVDA %>% as_tibble(rownames = "date") %>% mutate(ticker = "NVDA")
)
assets %>%
  nest(.by = ticker) %>%
  # Enrich each ticker's data frame separately
  mutate(data = lapply(data, tidy_ta, RSI, n = 14)) %>%
  unnest(data) %>%
  select(ticker, date, close, RSI) %>%
  filter(!is.na(RSI)) %>%
  slice_tail(n = 3, by = ticker)
#> # A tibble: 9 × 4
#> ticker date close RSI
#> <chr> <chr> <dbl> <dbl>
#> 1 BTC 2024-12-29 01:00:00 93564 43.1
#> 2 BTC 2024-12-30 01:00:00 92628. 41.5
#> 3 BTC 2024-12-31 01:00:00 93391. 43.4
#> 4 SPY 499 601. 54.6
#> 5 SPY 500 595. 47.8
#> 6 SPY 501 588. 41.9
#> 7 NVDA 499 49.4 57.8
#> 8 NVDA 500 49.5 58.3
#> 9 NVDA 501 49.5 58.3
Because tidy_ta() returns the full enriched data frame,
unnest() restores everything in one step. (The date column for SPY and NVDA
holds row indices because their row names are integers, not timestamps.) This scales to
multiple indicators by chaining inside the lapply():
assets %>%
  nest(.by = ticker) %>%
  mutate(data = lapply(data, function(d) {
    d %>%
      tidy_ta(RSI, n = 14) %>%
      tidy_ta(bollinger_bands)
  })) %>%
  unnest(data) %>%
  select(ticker, date, close, RSI, UpperBand, MiddleBand, LowerBand) %>%
  filter(!is.na(RSI)) %>%
  slice_tail(n = 3, by = ticker)
#> # A tibble: 9 × 7
#> ticker date close RSI UpperBand MiddleBand LowerBand
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 BTC 2024-12-29 01:00:00 93564 43.1 99871. 96134. 92398.
#> 2 BTC 2024-12-30 01:00:00 92628. 41.5 99714. 95620. 91527.
#> 3 BTC 2024-12-31 01:00:00 93391. 43.4 99374. 95236. 91099.
#> 4 SPY 499 601. 54.6 613. 598. 583.
#> 5 SPY 500 595. 47.8 611. 597. 583.
#> 6 SPY 501 588. 41.9 610. 596. 581.
#> 7 NVDA 499 49.4 57.8 50.2 49.0 47.7
#> 8 NVDA 500 49.5 58.3 50.2 49.1 48.0
#> 9 NVDA 501 49.5 58.3 50.3 49.2 48.2
Putting it all together
A complete pipeline: enrich a multi-asset dataset, flag RSI signals, and find the most recent event of each type per asset.
assets %>%
  nest(.by = ticker) %>%
  mutate(data = lapply(data, tidy_ta, RSI, n = 14)) %>%
  unnest(data) %>%
  filter(!is.na(RSI)) %>%
  mutate(
    # Rows matching neither condition get NA and are dropped below
    signal = case_when(
      RSI > 70 ~ "overbought",
      RSI < 30 ~ "oversold"
    )
  ) %>%
  filter(!is.na(signal)) %>%
  # Keep the last occurrence of each signal type per ticker
  slice_tail(n = 1, by = c(ticker, signal)) %>%
  select(ticker, date, close, RSI, signal) %>%
  arrange(ticker, signal)
#> # A tibble: 6 × 5
#> ticker date close RSI signal
#> <chr> <chr> <dbl> <dbl> <chr>
#> 1 BTC 2024-11-24 01:00:00 98016. 78.2 overbought
#> 2 BTC 2024-08-05 02:00:00 54045. 26.7 oversold
#> 3 NVDA 474 50.4 70.0 overbought
#> 4 NVDA 184 12.2 28.9 oversold
#> 5 SPY 484 608. 70.7 overbought
#> 6 SPY 207 411. 29.1 oversold
Summary
The entire wrapper is seven lines:
tidy_ta <- function(.data, .f, ..., .suffix = NULL) {
  result <- .f(.data, ...)
  if (!is.null(.suffix)) {
    colnames(result) <- paste(colnames(result), .suffix, sep = "_")
  }
  dplyr::bind_cols(.data, result)
}
It works because talib indicators already follow the
key convention: data frame in, data frame out, with row counts and row
names preserved. tidy_ta() just bridges the last
mile—binding the result back to the input so the pipeline keeps
flowing.
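The row-count convention is what makes the bind safe. A defensive variant, sketched here as a hypothetical tidy_ta_checked() (not part of talib, and not the wrapper used in this article), makes that assumption explicit and fails with a clearer message when a function breaks it. first_diff() and short_diff() are made-up stand-ins for indicators:

```r
library(dplyr)

# Hypothetical variant of tidy_ta() that checks the convention before binding
tidy_ta_checked <- function(.data, .f, ..., .suffix = NULL) {
  result <- .f(.data, ...)
  if (!is.data.frame(result) || nrow(result) != nrow(.data)) {
    stop("`.f` must return a data frame with one row per input row.",
         call. = FALSE)
  }
  if (!is.null(.suffix)) {
    colnames(result) <- paste(colnames(result), .suffix, sep = "_")
  }
  dplyr::bind_cols(.data, result)
}

# Stand-in that follows the convention: pads with NA to keep the row count
first_diff <- function(x) data.frame(Diff = c(NA, diff(x$close)))

# Stand-in that breaks it: diff() alone returns one row fewer
short_diff <- function(x) data.frame(Diff = diff(x$close))

prices <- data.frame(close = c(10, 12, 11))

ok <- tidy_ta_checked(prices, first_diff)

# The mismatched result is rejected before bind_cols() is reached
failed <- inherits(
  try(tidy_ta_checked(prices, short_diff), silent = TRUE),
  "try-error"
)
```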
The pattern is not specific to talib. Any function that takes a data frame and returns a same-length data frame can be wrapped the same way.
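As a minimal sketch of that claim, the example below wraps a made-up function, bar_range(), that has nothing to do with talib. The function, its scale argument, and the bars data are all hypothetical and chosen only to show the shape of the pattern:

```r
library(dplyr)

tidy_ta <- function(.data, .f, ..., .suffix = NULL) {
  result <- .f(.data, ...)
  if (!is.null(.suffix)) {
    colnames(result) <- paste(colnames(result), .suffix, sep = "_")
  }
  dplyr::bind_cols(.data, result)
}

# Hypothetical stand-in for an indicator:
# data frame in, data frame with the same number of rows out
bar_range <- function(x, scale = 1) {
  data.frame(Range = (x$high - x$low) * scale)
}

bars <- data.frame(
  high = c(12, 15, 14),
  low  = c(10, 11, 13)
)

# Extra arguments travel through ..., and .suffix avoids the name collision
enriched <- bars %>%
  tidy_ta(bar_range) %>%
  tidy_ta(bar_range, scale = 2, .suffix = "x2")

names(enriched)  # high, low, Range, Range_x2
```

Nothing in tidy_ta() had to change: the wrapper only relies on the row count of the result matching the input.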
