Bandpass Filtering in R Without Aggregation Using the data.table and signal Packages

BY Operation on data.table without Aggregation

Introduction

In this article, we explore how to apply an operation to each group of a data.table in R without aggregating the result and without writing explicit loops. This is useful when a large dataset holds many signals, identified by several factors, that all need the same filter.

We will start by generating a sample signal and then walk through bandpass filtering it with the butter and filtfilt functions from the signal package.

Generating Data

To generate our sample data, we can use the following code:

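library(signal)      # butter(), filtfilt()
library(data.table)
library(ggplot2)

set.seed(1234)
fs <- 128                        # sampling rate in Hz (samples per second)
tseq <- seq(0, .999, by = 1/fs)  # 128 time points covering 1 second

# Sum of two sinusoids with random frequencies, plus two Gaussian noise terms
generate_sig <- function(t) {
  sin(rnorm(1)*40*pi*t*.5) + 0.11*rnorm(length(t)) +
    sin(rnorm(1)*40*pi*t*.5) + 0.31*rnorm(length(t))
}

x <- generate_sig(tseq)
plot(NA, NA, xlim = c(0, 128), ylim = c(-pi, pi), xlab = 't', ylab = 'signal amplitude')
lines(x, col = 'red')       # raw signal

# 2nd-order Butterworth bandpass, 1-15 Hz; cutoffs are fractions of the Nyquist frequency fs/2
b <- butter(2, c(1, 15)*(2/fs), type = 'pass')

xfil <- filtfilt(b, x)      # forward-backward (zero-phase) filtering

lines(xfil, col = 'black')  # filtered signal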

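This code generates one second of signal made of two sinusoids with random frequencies plus Gaussian noise, then applies a second-order Butterworth bandpass filter (1 to 15 Hz). filtfilt runs the filter forward and then backward, so the output has no phase shift. The raw signal is drawn in red and the filtered signal in black.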
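The cutoff frequencies passed to butter are not given in Hz. They are expressed as a fraction of the Nyquist frequency (half the sampling rate), which is why the band edges are multiplied by 2/fs. Below is a minimal base R sketch of that conversion; the helper name normalize_cutoff is hypothetical and not part of any package.

```r
fs <- 128  # sampling rate in Hz, same value as in the article

# Hypothetical helper: convert cutoffs in Hz to fractions of the Nyquist frequency
normalize_cutoff <- function(f_hz, fs) {
  nyquist <- fs / 2
  # cutoffs must lie strictly between 0 and the Nyquist frequency
  stopifnot(all(f_hz > 0), all(f_hz < nyquist))
  f_hz / nyquist
}

w <- normalize_cutoff(c(1, 15), fs)
print(w)  # 1/64 and 15/64, i.e. 0.015625 and 0.234375
```

The resulting vector w is the same as c(1, 15)*(2/fs), so it could be passed to butter in its place.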

Generating data.table Data

Next, we can generate our data.table using the following code:

val_pname   <- c('p1', 'p2')
val_factor1 <- c('left', 'right')
val_factor2 <- c('pain', 'reward', 'sham')
nb_samples  <- length(tseq)
nb_signals  <- length(val_pname)*length(val_factor1)*length(val_factor2)

# Each column repeats its values so that every participant x factor1 x factor2
# combination owns one contiguous block of nb_samples rows
col_pname   <- factor(rep(val_pname, each = length(val_factor1)*length(val_factor2)*nb_samples))
col_factor1 <- factor(rep(rep(val_factor1, each = length(val_factor2)*nb_samples), length(val_pname)))
col_factor2 <- factor(rep(rep(val_factor2, each = nb_samples), length(val_factor1)*length(val_pname)))
col_t       <- rep(tseq, nb_signals)

# replicate() returns an nb_samples x nb_signals matrix; as.numeric() flattens it column by column
col_values  <- as.numeric(replicate(nb_signals, generate_sig(tseq)))
df <- data.table(participant = col_pname, factor1 = col_factor1, factor2 = col_factor2,
                 t = col_t, t_idx = col_t, val = col_values)

# visualizing the whole data table
ggplot(df,aes(x=t, y=val, color=factor1))+
  geom_line()+
  facet_grid(factor2~participant)+
  theme_bw()

This code builds a long-format data.table holding 12 signals (2 participants x 2 levels of factor1 x 3 levels of factor2), each with 128 time samples, for 1536 rows in total.
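The nested rep calls are easy to get wrong. As an alternative, here is a sketch that builds the same kind of table with CJ (cross join) from data.table and then draws one signal per group. It is an assumption-laden rewrite, not the article's method: the name grid is hypothetical, and CJ yields character columns rather than factors.

```r
library(data.table)

set.seed(1234)
fs <- 128
tseq <- seq(0, .999, by = 1/fs)

# Same signal generator as in the article
generate_sig <- function(t) {
  sin(rnorm(1)*40*pi*t*.5) + 0.11*rnorm(length(t)) +
    sin(rnorm(1)*40*pi*t*.5) + 0.31*rnorm(length(t))
}

# CJ() builds every combination of its arguments, sorted by column
grid <- CJ(participant = c('p1', 'p2'),
           factor1 = c('left', 'right'),
           factor2 = c('pain', 'reward', 'sham'),
           t = tseq)

# Draw one independent signal per participant x factor1 x factor2 group
grid[, val := generate_sig(t), by = .(participant, factor1, factor2)]

# Sanity check: every group should contain exactly length(tseq) rows
counts <- grid[, .N, by = .(participant, factor1, factor2)]
print(counts)
```

Counting rows per group in this way is also a quick check on a table built by hand with rep.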

Main Issue

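The goal is to bandpass filter every signal in the table without writing an explicit loop over participants and conditions. The filter must be applied separately to each participant x factor1 x factor2 combination. Unlike a typical grouped aggregation, which returns one row per group, it must return one filtered value for every input row.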

Solution

To solve this problem, we can call the filtfilt function from the signal package inside the j expression of the data.table and group the call with by, reusing the Butterworth filter b designed earlier.

filtered_df <- df[, .(val = filtfilt(b, val), t = t), by = .(participant, factor1, factor2)]

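This code splits the table into the groups defined by by and applies filtfilt to the val vector of each group. Because filtfilt returns a vector as long as its input, each group keeps all of its rows instead of collapsing to a single one, and no explicit loop is needed. The by argument lists the columns whose combinations define the groups, which here gives one group per signal.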
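The difference between an aggregating and a non-aggregating by call is easier to see on a toy table. The sketch below uses cumsum as a stand-in for filtfilt, since both return as many values as they receive; the table dt and its columns are made up for illustration.

```r
library(data.table)

dt <- data.table(g = c('a', 'a', 'b', 'b'), v = c(1, 2, 3, 4))

# Aggregating j: sum() returns one value per group, so one row per group
agg <- dt[, .(v = sum(v)), by = g]

# Non-aggregating j: cumsum() returns as many values as it receives,
# so each group keeps all of its rows
keep <- dt[, .(v = cumsum(v)), by = g]

# := adds the result as a new column by reference and keeps every existing column
dt[, v_cum := cumsum(v), by = g]

print(agg)   # 2 rows: a = 3, b = 7
print(keep)  # 4 rows: 1, 3 for group a and 3, 7 for group b
print(dt)    # original columns plus v_cum
```

The := form may be preferable when the other columns of the table should be kept as they are. Applied to the article's table it would read df[, val_filt := filtfilt(b, val), by = .(participant, factor1, factor2)], where val_filt is a hypothetical column name.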

Visualizing the Solution

To visualize our solution, we can use the following code:

ggplot(filtered_df,aes(x=t, y=val, color=factor1))+
  geom_line()+
  facet_grid(factor2~participant)+
  theme_bw()

This code plots the filtered signals in a grid, with one column per participant and one row per level of factor2, and with factor1 mapped to line color.

Conclusion

In this article, we explored how to apply an operation to each group of a data.table in R without aggregation and without explicit loops. We generated a sample dataset, designed a Butterworth bandpass filter with butter, and applied it to every signal with filtfilt from the signal package inside a grouped data.table call. The result is a table of filtered signals that keeps the original number of rows and can be plotted in a faceted grid. This approach is useful when a large dataset holds many signals, identified by several factors, that all need the same filter.

Last modified on 2024-08-27