Databricks notebook source exported at Tue, 28 Jun 2016 10:38:24 UTC
Scalable Data Science
Student Project Presentation by Shanshan Zhou
The html source url of this databricks notebook and its recorded Uji :
Identify hand motions from EEG recordings
by Shanshan Zhou
**Patients who have lost hand function due to amputation or neurological disabilities wake up to this reality every day.**
- Restoring a patient’s ability to perform these basic activities of daily life with a brain-computer interface (BCI) prosthetic device would greatly increase their independence and quality of life.
- Currently, there are no realistic, affordable, or low-risk options for neurologically disabled patients to directly control external prosthetics with their brain activity.
A possible solution …
- Recorded from the human scalp, EEG signals are evoked by brain activity.
- The relationship between brain activity and EEG signals is complex and poorly understood outside of specific laboratory tests.
- Providing affordable, low-risk, non-invasive BCI devices is dependent on further advancements in interpreting EEG signals.
A tutorial on how to process EEG data
by Alexandre Barachant
http://blog.kaggle.com/2015/10/12/grasp-and-lift-eeg-winners-interview-1st-place-cat-dog/
%scala
//This allows easy embedding of publicly available information into any other notebook
//when viewing in git-book just ignore this block - you may have to manually chase the URL in frameIt("URL").
//Example usage:
// displayHTML(frameIt("https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation#Topics_in_LDA",250))
def frameIt( u:String, h:Int ) : String = {
"""<iframe
src=""""+ u+""""
width="95%" height="""" + h + """"
sandbox>
<p>
<a href="http://spark.apache.org/docs/latest/index.html">
Fallback link for browsers that, unlikely, don't support frames
</a>
</p>
</iframe>"""
}
displayHTML(frameIt("http://blog.kaggle.com/2015/10/12/grasp-and-lift-eeg-winners-interview-1st-place-cat-dog/",600))
display(dbutils.fs.ls("dbfs:/datasets/eeg/")) #data already in dbfs - see below for details
testRdd = sc.textFile('dbfs:/datasets/eeg/test.zip')
trainRdd = sc.textFile('dbfs:/datasets/eeg/train.zip')
%fs ls "dbfs:/home/ubuntu/databricks/EEG/train"
subj3_series3_events_Path = "dbfs:/home/ubuntu/databricks/EEG/train/subj3_series3_events.csv"
subj3_series4_events_Path = "dbfs:/home/ubuntu/databricks/EEG/train/subj3_series4_events.csv"
subj3_series3_data_Path = "dbfs:/home/ubuntu/databricks/EEG/train/subj3_series3_data.csv"
subj3_series4_data_Path = "dbfs:/home/ubuntu/databricks/EEG/train/subj3_series4_data.csv"
generate RDD
subj3_series3_events = sc.textFile(subj3_series3_events_Path)
subj3_series4_events = sc.textFile(subj3_series4_events_Path)
subj3_series34_events = subj3_series3_events.union(subj3_series4_events)
subj3_series3_data = sc.textFile(subj3_series3_data_Path)
subj3_series4_data = sc.textFile(subj3_series4_data_Path)
subj3_series34 = subj3_series3_data.union(subj3_series4_data)
generate DataFrame from csv file
subj3_series3_Raw_DF = sqlContext.read.format('com.databricks.spark.csv').options(header='true', inferSchema='true').load(subj3_series3_data_Path)
subj3_series3_DF = subj3_series3_Raw_DF.drop('id')
display(subj3_series3_DF)
create DF from RDD
subj3_series4_Raw_DF = subj3_series4_data.map(lambda x: (x, )).toDF()
subj3_series3_events_Raw_DF = sqlContext.read.format('com.databricks.spark.csv').options(header='true', inferSchema='true').load(subj3_series3_events_Path)
subj3_series4_events_Raw_DF = subj3_series4_events.map(lambda x: (x, )).toDF()
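# subj3_series3_events_DF was not defined earlier; assumption: drop 'id' as was done for the data DataFrame
subj3_series3_events_DF = subj3_series3_events_Raw_DF.drop('id')
display(subj3_series3_events_DF)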
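Wrapping each text line in a 1-tuple before `toDF()`, as done for the series 4 files, gives a DataFrame with a single string column rather than one column per CSV field. The following is a minimal sketch of splitting the lines into typed fields first, written in plain Python so that it runs without a cluster. `parse_csv_lines` is a hypothetical helper, and the sample rows are made up for illustration rather than taken from the dataset.

```python
def parse_csv_lines(lines):
    """Split CSV text lines into a header and typed rows.

    Hypothetical helper: the first field of each row is kept as a string id,
    the remaining fields are converted to int.
    """
    lines = list(lines)
    header = lines[0].split(",")
    rows = []
    for line in lines[1:]:
        fields = line.split(",")
        rows.append(tuple([fields[0]] + [int(v) for v in fields[1:]]))
    return header, rows


# Made-up sample lines in the shape of an events file (id plus 0/1 labels).
sample = [
    "id,HandStart,FirstDigitTouch",
    "subj3_series4_0,0,0",
    "subj3_series4_1,1,0",
]
header, rows = parse_csv_lines(sample)
print(header)
print(rows)
```

On the cluster, the same per-line split could be applied with `rdd.map(...)` after filtering out the header line, and the resulting tuples passed to `toDF(header)` so that columns such as `HandStart` can be referenced by name.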
# Neural oscillation
- A neural oscillation is characterized by a change in signal power in specific frequency bands. These oscillations appear naturally in ongoing EEG activity and can also be induced by a specific task, for example a hand movement or mental calculation.
- For each subject, we should see a spot over electrode C3 (left motor cortex, corresponding to a right-hand movement) and a decrease of the signal power in the 10 Hz and 20 Hz bands during the movement (relative to the period after the movement).
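The drop in band power described above can be illustrated on synthetic data. This is a sketch under stated assumptions, not the tutorial's method: the 500 Hz sampling rate, the 8 to 12 Hz band limits, the amplitudes, and the helper `band_power` are all chosen for illustration, and the signal is simulated rather than read from the EEG files.

```python
import numpy as np


def band_power(signal, fs, low, high):
    """Mean power of `signal` in the [low, high] Hz band, from the FFT periodogram."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= low) & (freqs <= high)
    return power[mask].mean()


fs = 500.0  # assumed sampling rate in Hz
t = np.arange(int(2 * fs)) / fs  # two seconds of samples per segment
rng = np.random.RandomState(0)

# Synthetic single-channel data: a 10 Hz rhythm that is weaker during "movement".
during = 0.3 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.randn(len(t))
after = 1.0 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.randn(len(t))

p_during = band_power(during, fs, 8, 12)
p_after = band_power(after, fs, 8, 12)
print(p_during, p_after)
```

With real recordings, the same comparison would be made per electrode (for example C3) between a window around the movement and a window after it.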
subj3_series3_events_DF.filter("HandStart = 1").count()
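# Text lines wrapped in 1-tuples give a single string column with no HandStart field,
# so parse the series 4 events CSV (new name, introduced here) before filtering.
subj3_series4_events_parsed_DF = sqlContext.read.format('com.databricks.spark.csv').options(header='true', inferSchema='true').load(subj3_series4_events_Path)
subj3_series3_events_Raw_DF.unionAll(subj3_series4_events_parsed_DF).filter("HandStart = 1").count()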
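Filtering on `HandStart = 1` and calling `count()` gives the number of matching rows, which is not necessarily the number of movements. Assuming each event is labelled by a run of consecutive 1s rather than by a single row (an assumption about the label format, not something shown in this notebook), the number of events is the number of 0-to-1 transitions. A small sketch with a made-up label column and a hypothetical helper `count_onsets`:

```python
def count_onsets(labels):
    """Count 0 -> 1 transitions in a sequence of 0/1 labels."""
    onsets = 0
    previous = 0
    for value in labels:
        if value == 1 and previous == 0:
            onsets += 1
        previous = value
    return onsets


# Made-up label column: two events, each marked by a run of consecutive 1s.
hand_start = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0]
n_rows = sum(hand_start)                # rows where the label is 1
n_events = count_onsets(hand_start)     # separate runs of 1s
print(n_rows, n_events)  # prints: 5 2
```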
raw = creat_mne_raw_object(subj3_series3_DF)
# get channel names (a Spark DataFrame exposes them as .columns; list(df) is the pandas idiom)
ch_names = subj3_series3_DF.columns
ch_names
To get the data into dbfs, let’s download and save it.
%sh
pwd
%sh
df -h /databricks/driver
The data at http://www.math.canterbury.ac.nz/~r.sainudiin/tmp/ may be deleted in the future.
%sh
wget http://www.math.canterbury.ac.nz/~r.sainudiin/tmp/test.zip
%sh
wget http://www.math.canterbury.ac.nz/~r.sainudiin/tmp/train.zip
dbutils.fs.mkdirs("dbfs:/datasets/eeg")
dbutils.fs.cp("file:/databricks/driver/train.zip","dbfs:/datasets/eeg/")
dbutils.fs.cp("file:/databricks/driver/test.zip","dbfs:/datasets/eeg/")
display(dbutils.fs.ls("dbfs:/datasets/eeg/"))
testRdd = sc.textFile('dbfs:/datasets/eeg/test.zip')
trainRdd = sc.textFile('dbfs:/datasets/eeg/train.zip')
testRdd.take(5)
trainRdd.take(5)
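Note that `sc.textFile` does not unpack zip archives, so `take(5)` on a `.zip` path shows raw archive bytes decoded as text rather than CSV rows. The following is a minimal sketch of reading the members of a zip archive with Python's standard `zipfile` module. `read_zip_members` is a hypothetical helper, and the archive is built in memory from a made-up CSV so that the example is self-contained.

```python
import io
import zipfile


def read_zip_members(zip_bytes):
    """Return {member name: list of text lines} for every file in a zip archive."""
    contents = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            text = archive.read(name).decode("utf-8")
            contents[name] = text.splitlines()
    return contents


# Build a tiny in-memory archive with a made-up CSV (values are illustrative only).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("train/example_data.csv", "id,Fp1,Fp2\nrow_0,-31,363\n")

members = read_zip_members(buffer.getvalue())
print(members["train/example_data.csv"])
```

On the cluster, one option would be to apply such a function to the (path, bytes) pairs returned by `sc.binaryFiles`; another is to unzip on the driver in a `%sh` cell before copying the extracted CSV files to dbfs.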
%sh
rm train.zip test.zip
dbutils.fs.help()