In this tutorial, I will show you how to download data from the ERP CORE website and how to process it using the EEGLAB and ERPLAB packages for MATLAB. The ERP CORE website contains all the information about the experiment, a link to the paper, and a link to the data.
About ERP Core website
The ERP CORE is an online resource with optimized paradigms, experiment control scripts, example data from 40 participants, data processing pipelines and analysis scripts, and a broad set of results for 7 different ERP components obtained from 6 different ERP paradigms:
- N170 (Face Perception Paradigm)
- MMN (Passive Auditory Oddball Paradigm)
- N2pc (Simple Visual Search Paradigm)
- N400 (Word Pair Judgement Paradigm)
- P3b (Active Visual Oddball Paradigm)
- LRP and ERN (Flankers Paradigm)
The experiment control scripts, data, and data analysis scripts can be downloaded here.
MATLAB, EEGlab and ERPlab
MATLAB is a programming language that is widely used for analysing neuroimaging data. EEGLAB and ERPLAB are additional packages that require MATLAB to run. While MATLAB is commercial software that requires a license, both packages are free to download.
Downloading and viewing the data
On the ERP CORE website you can find the different components they analysed (see above). For our demo, we will look at the MMN (mismatch negativity). Click on MMN All data and Scripts and download it (download as zip). This will take a while because it is a big file.
Let's have a look at these files in more detail to check what they are:
- First, have a look at the README document, which contains all the information about what this folder contains, how to contact the researchers, and details of their analysis pipeline.
- In the MMN folder there is an individual folder for each subject. Further down there is a subfolder called EEG_ERP_Processing, where you will find some useful scripts that you can use for data processing and analysis.
- Have a look at one of them. With EEGLAB and ERPLAB (and some other software too) there are usually two ways of doing your processing and analysis: (1) you can do it entirely through custom code, as they did there, or (2) you can use the GUI (i.e., click buttons), which is what we are going to do now. These scripts represent the same steps as the ones we will cover in this tutorial.
Once we have downloaded and previewed our data, we can begin processing it with EEGLAB. First, open MATLAB. Then, open EEGLAB by typing eeglab in the command window. Some warnings may show up, but you can ignore them for now.
Let’s have a look at the data. To do that, click on File > Load existing dataset and, for this demo, pick Subject 1 from the MMN dataset: select the file called 1_MMN.set.
Once we have loaded the dataset, we can see some information about it:
- Number of channels – here 33. There are 30 active electrodes and 3 channels for recording ocular artefacts
- Sample rate – the number of samples taken per second of the EEG recording
To look at our data, go to Plot > Channel data (scroll). Change some settings by going to Display > Remove DC offset. Now you can see our raw EEG data, with each channel plotted as a separate row, channel names on the y-axis and time on the x-axis.
1. Defining channel locations
First, we must define channel locations. This lets EEGLAB know which channels correspond to which locations on the scalp. This step matters because there are different electrode setups, so you always need to check that your dataset is annotated correctly.
To do this, go to Edit > Channel locations. Confirm by clicking OK. This will open a new window, where you can preview the different channels (e.g., FP1, FC3) and where they are located on the scalp (i.e., their coordinates). If that looks correct, confirm by clicking OK.
2. Down-sampling
Next, we're going to down-sample our data. This will reduce the size of our data and speed up processing. To do this, go to Tools > Change sampling rate and enter the sampling rate we want. Let's put 256 and click OK.
Now, EEGLAB will give us the option to rename the processed dataset, so we can keep it in EEGLAB and return to it later if we need to. Once we save it, clicking on Datasets shows both the original dataset and the resampled dataset we just created. This is also reflected in the main blue window with the dataset description (see the sampling rate and the name of the current dataset).
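To get a feel for what down-sampling does to the data, here is a minimal Python sketch (not the EEGLAB implementation). It assumes, purely for illustration, an original sampling rate of 1024 Hz, and it reduces the rate by averaging blocks of neighbouring samples. EEGLAB applies a proper anti-aliasing filter when it resamples, so treat this only as a conceptual picture.

```python
def downsample_by_averaging(samples, factor):
    """Average each block of `factor` consecutive samples into one value."""
    usable = len(samples) - len(samples) % factor  # drop an incomplete final block
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


original_rate = 1024   # Hz, assumed for illustration only
target_rate = 256      # Hz, the value entered in the tutorial
factor = original_rate // target_rate

# One second of made-up data for a single channel.
one_second = [float(i % 8) for i in range(original_rate)]
resampled = downsample_by_averaging(one_second, factor)

print(len(one_second), "samples ->", len(resampled), "samples")
```

One second of data now takes 256 values instead of 1024, which is why every later step runs faster.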
3. Re-referencing
Now, we will re-reference our data. Typically, researchers use earlobe or mastoid electrodes as references, but other options are also available (e.g., the average of all electrodes). Your choice of reference will depend on your experimental design. Here, they used channels P9 and P10, which are adjacent to the mastoids.
To re-reference the data, go to Tools > Re-reference the data and tick the option called Re-reference data to channel(s):. Then, click on the ... button, select P9 and P10 from the dropdown menu, and click OK. The selected reference electrodes should now be displayed in the window. To confirm, click OK. Then, you can save the re-referenced dataset: change the name of the dataset to 1_MMN_rereferenced and click OK.
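Conceptually, re-referencing to P9 and P10 means subtracting the average of those two channels from every channel at each time point. The Python sketch below illustrates that arithmetic on made-up values; the function name and the numbers are hypothetical, and EEGLAB's own routine handles details (such as what happens to the reference channels afterwards) that are not shown here.

```python
def rereference(data, ref_labels):
    """Subtract the mean of the reference channels from every channel, sample by sample."""
    n_samples = len(next(iter(data.values())))
    reference = [
        sum(data[label][t] for label in ref_labels) / len(ref_labels)
        for t in range(n_samples)
    ]
    return {
        label: [value - ref for value, ref in zip(samples, reference)]
        for label, samples in data.items()
    }


# Three time points of made-up data (in microvolts) for three channels.
data = {
    "FCz": [10.0, 12.0, 8.0],
    "P9": [2.0, 4.0, 0.0],
    "P10": [4.0, 2.0, 2.0],
}
rereferenced = rereference(data, ["P9", "P10"])
print(rereferenced["FCz"])
```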
4. Defining additional channels
The next step is to create new EEG channels that will help us remove eye blinks and eye movements from the data. Whether you need to perform this step depends on your experimental setup. Here, additional electrodes were placed around the eyes, so we can do that. To define new channels, go to ERPLAB > EEG channel operations.
We can find horizontal eye movements by taking the difference between the electrodes placed outside the left eye and outside the right eye (i.e., channels 29 and 30). For eye blinks, we can take the difference between the electrode just above the eye (i.e., electrode FP2, channel 15) and channel 31, just below the right eye.
Use the following formulas to get these additional channels:
ch32 = ch29 - ch30 Label (uncorrected) HEOG
ch33 = ch15 - ch31 Label (uncorrected) VEOG
To confirm, click RUN. Then, save the dataset with additional channels as 1_MMN_rereferenced_chop.
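Why does a simple subtraction isolate eye movements? A horizontal eye movement pushes the two electrodes beside the eyes in opposite directions, while brain activity reaches both in much the same way. The Python sketch below shows this with made-up numbers (the signals and the helper name are hypothetical): the shared signal cancels and the eye movement is doubled.

```python
def bipolar(channel_a, channel_b):
    """Sample-by-sample difference between two channels (a minus b)."""
    return [a - b for a, b in zip(channel_a, channel_b)]


brain = [1.0, 2.0, 3.0, 2.0]       # activity picked up equally by both electrodes
saccade = [0.0, 0.0, 20.0, 20.0]   # eye movement starting at the third sample

left_eye = [b + s for b, s in zip(brain, saccade)]    # deflects one way
right_eye = [b - s for b, s in zip(brain, saccade)]   # deflects the other way

heog = bipolar(left_eye, right_eye)
print(heog)
```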
5. Filtering: High-pass filter
Next, we’re going to filter our dataset to remove unwanted noise. For now, we are going to apply a high-pass filter. Go to ERPLAB > Filter & Frequency Tools > Filters for EEG data. This will open a new window that gives us a lot of options for defining our filters.
Click the High-pass button to start defining the filter settings. Set the frequency to 0.1 Hz. You can then change the steepness of the filter by changing the roll-off. To explore, change it and see what happens. For the high-pass filter, we want a roll-off of 12 dB per octave.
When you are happy with your filter, click APPLY. Save the filtered data as 1_MMN_filtered.
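To illustrate what a 0.1 Hz high-pass filter does, here is a small Python sketch of a first-order high-pass filter. It is deliberately simpler than the filter configured above: a first-order filter has a roll-off of about 6 dB per octave, not 12, and it is not the filter ERPLAB applies. It is only meant to show that a constant offset is removed while the 0.1 Hz cutoff leaves faster activity largely intact.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate):
    """First-order (RC-style) high-pass filter, applied sample by sample."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    filtered = [samples[0]]
    for n in range(1, len(samples)):
        filtered.append(alpha * (filtered[-1] + samples[n] - samples[n - 1]))
    return filtered


sample_rate = 256
offset = [50.0] * (sample_rate * 30)   # 30 s of a constant 50 microvolt offset
filtered = high_pass(offset, cutoff_hz=0.1, sample_rate=sample_rate)

print(filtered[0], filtered[-1])       # starts at the offset, ends close to zero
```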
6. Removing eye movements with ICA
Next, we will run independent component analysis (ICA) to remove eye blinks and eye movements. Go to Tools > Decompose data with ICA and select the desired algorithm. We’re going to use runica, the default algorithm, but other, more sophisticated options are also available.
Click on Channels to select the channels we want to run ICA on. Here, we select all channels except for (uncorrected) HEOG and (uncorrected) VEOG and click OK. Then, click OK again. ICA takes a good few minutes, so take a break and come back when processing is finished.
7. Plotting component activations
Once ICA has finished, we can have a look at which components relate to eye blinks and eye movements.
To plot individual components, go to Plot > Component activations (scroll). Change the scale of the plot from 9.934 to 30 (the box next to +/-) and click the Reject button.
This plot looks similar to the one we’ve seen before, but here, instead of one channel per row on the y-axis, we see one component per row.
It looks like Component 1 corresponds to eye blinks, because we can clearly see very distinct peaks in the data. Component 8 might represent our horizontal eye movements, because it shows a square-wave pattern. When satisfied with your choice, close the plot.
8. Rejecting selected components
Now we will reject those components. To do this, go to Tools > Inspect/label components by map. Select 1:10 to see the topoplots. Click on 1, as this is the component we identified as our eye blink. To reject this component, change ACCEPT to REJECT. Do the same for the eye movement component.
Now go to Tools > Remove components from data. Because we already flagged our components for rejection, EEGLAB knows we want to remove them, so we click YES.
Note: Before you click, you can plot the data.
After removing the components, our uncorrected data will be displayed in blue and the corrected data in red. It looks like eye blinks and eye movements were successfully removed. Scroll through the data a bit to make sure it worked. Then, click Accept and save the data as 1_MMN_filtered pruned with ICA.
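What happens when a component is removed? ICA describes each channel as a weighted mix of component activations. Removing a component means rebuilding the channel data from all components except the rejected ones. The Python sketch below shows that idea with two made-up components; the weights, the signals, and the function name are hypothetical, and components are counted from 0 here, whereas EEGLAB counts from 1.

```python
def remove_components(mixing, activations, rejected):
    """Rebuild the channel data while leaving out the rejected components."""
    n_components = len(activations)
    n_samples = len(activations[0])
    kept = [k for k in range(n_components) if k not in rejected]
    return [
        [sum(weights[k] * activations[k][t] for k in kept) for t in range(n_samples)]
        for weights in mixing
    ]


# Two hypothetical components over three time points.
activations = [
    [0.0, 50.0, 0.0],   # component 0: a blink (one large spike)
    [1.0, 2.0, 3.0],    # component 1: brain activity
]
# One row per channel: how strongly each component projects onto that channel.
mixing = [
    [1.0, 1.0],   # frontal channel, picks up the blink fully
    [0.1, 1.0],   # posterior channel, barely picks up the blink
]

raw = remove_components(mixing, activations, rejected=[])
cleaned = remove_components(mixing, activations, rejected=[0])
print(raw[0], "->", cleaned[0])
```

The blink spike disappears from both channels while the brain activity is left untouched, which is the blue-versus-red comparison EEGLAB shows before you accept the removal.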
9. Epoching
Now, we are ready to epoch our data. This step divides our continuous data into discrete epochs that are time-locked to the relevant events in our stimulus presentation.
First, create an event list: go to ERPLAB > EventList > Upload EventList from txt file. Next, assign bins and conditions from our experiment. These are given to us in a txt file called BDF_MMN.txt in the folder EEG_ERP_Processing. When we open this file, we can see that Bin 1 corresponds to our deviant events, labelled with the event code 70, and Bin 2 corresponds to our standard events, marked with the event code 80.
Go to ERPLAB > Assign bins, select that file, and then click RUN and OK.
To epoch our data, go to ERPLAB > Extract bin-based epochs. A new window shows up asking at which times we want to epoch our data. For this experiment, we will set the epoch to run from 200 ms before stimulus onset (-200 ms) to 800 ms after it.
We also apply baseline correction, taking the mean of the pre-stimulus interval (-200 ms to 0 ms) as our baseline. Select Pre and click RUN.
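The Python sketch below illustrates the logic of these steps for a single channel: event codes are mapped to bins (70 to Bin 1, 80 to Bin 2, as in BDF_MMN.txt), a window from -200 ms to 800 ms is cut around each event, and the mean of the pre-stimulus samples is subtracted. The signal, the event positions, and the function name are made up for illustration, and ERPLAB may round the window edges differently.

```python
def extract_epochs(signal, events, bins, sample_rate, tmin_ms, tmax_ms):
    """Cut one epoch per event and subtract the mean of its pre-stimulus samples.

    `events` is a list of (sample_index, event_code) pairs and `bins` maps an
    event code to a bin number. Returns {bin_number: [epoch, ...]}.
    """
    pre = round(-tmin_ms * sample_rate / 1000)   # samples before stimulus onset
    post = round(tmax_ms * sample_rate / 1000)   # samples after stimulus onset
    epochs = {b: [] for b in bins.values()}
    for onset, code in events:
        if code not in bins or onset - pre < 0 or onset + post > len(signal):
            continue  # unknown code, or the window runs past the recording
        segment = signal[onset - pre:onset + post]
        baseline = sum(segment[:pre]) / pre
        epochs[bins[code]].append([v - baseline for v in segment])
    return epochs


sample_rate = 256
signal = [5.0] * 1000                        # flat recording with a 5 microvolt offset
signal[610:620] = [8.0] * 10                 # brief response after the deviant
events = [(20, 80), (300, 80), (600, 70)]    # (sample index, event code)
bins = {70: 1, 80: 2}                        # Bin 1 = deviants, Bin 2 = standards

epochs = extract_epochs(signal, events, bins, sample_rate, -200, 800)
print(len(epochs[1]), "deviant epoch,", len(epochs[2]), "standard epoch")
```

The first standard event is skipped because its window would start before the recording does, and after baseline correction the 5 microvolt offset is gone from every epoch.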
10. Removing remaining artefacts
Now, we need to remove any remaining artefacts from our data. Go to ERPLAB > Artifact detection in epoched data > Simple voltage threshold. The menu offers many different detection methods, but here we just go with the simple voltage threshold.
A new window opens. Set the voltage limits on the right to -100 100 and select channels 1:28. This means that we will reject any trial in which the voltage on one of the selected channels goes above or below that threshold (in µV). Then click ACCEPT.
A plot and a new window for saving the dataset show up. Before we save, let’s check the command window. Here you can see, for each bin, the proportion of accepted and rejected trials, and you will notice that the number of rejected trials is quite high. This possibly means that there are some bad channels in our data. So let’s close the plot and cancel the save.
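The rule behind the simple voltage threshold is easy to write down: an epoch is flagged if any sample on any selected channel falls outside the limits. The Python sketch below applies that rule to three made-up epochs with two channels each; the data and the function name are hypothetical.

```python
def flag_epochs(epochs, low=-100.0, high=100.0):
    """Return one flag per epoch: True if any sample on any channel leaves [low, high]."""
    return [
        any(v < low or v > high for channel in epoch for v in channel)
        for epoch in epochs
    ]


# Each epoch is a list of channels; each channel is a list of samples in microvolts.
epochs = [
    [[5.0, -20.0, 30.0], [1.0, 2.0, 3.0]],      # clean
    [[5.0, 150.0, 30.0], [1.0, 2.0, 3.0]],      # goes above +100
    [[5.0, -20.0, 30.0], [1.0, -101.0, 3.0]],   # goes below -100
]
rejected = flag_epochs(epochs)
percent_rejected = 100 * sum(rejected) / len(rejected)
print(rejected, f"{percent_rejected:.1f}% rejected")
```

Because a single noisy channel is enough to flag an epoch, a few bad channels can push the rejection rate up for the whole dataset.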
Dealing with bad channels
For this subject there are actually 3 bad channels (noisy!). To fix that, we can go to Tools > Interpolate electrodes. A new window shows up. Here we can select our channels by clicking "Select from data channels". Choose the following channels: C5, F4, and F8. Now we can try artefact rejection again.
Note: Here, I told you which channels are bad. In real life, you have to figure this out yourself by either scrolling through the data (!) or relying on notes you made during testing.
11. Filtering: Low-pass filter
We apply a low-pass filter to remove high-frequency noise from our data. Go to ERPLAB > Filter & Frequency Tools > Filters for EEG data and set the low-pass cutoff to 20 Hz with a roll-off of 48 dB per octave.
12. Computing average ERPs
To compute ERPs, go to ERPLAB > Compute averaged ERPs and click RUN. You will be prompted to save the new dataset; name it and save it. To plot the ERP, go to ERPLAB > Plot ERP waveforms and select channel 20 (FCz).
Computing MMN difference wave
To compute the difference wave, go to ERPLAB > ERP Operations > ERP Bin Operations and type b3 = b1 - b2. This creates a new bin that reflects the difference between the deviant and the standard tones. Now you can plot it with ERPLAB > Plot ERP waveforms by selecting only bin 3. And voila, here's your MMN!
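As a final illustration, the Python sketch below shows the arithmetic behind the last two steps: averaging the epochs in each bin gives one ERP per bin, and b3 = b1 - b2 subtracts the standard ERP from the deviant ERP point by point. The epochs are tiny made-up examples with three samples each.

```python
def average_epochs(epochs):
    """Average epochs sample by sample to get one ERP waveform."""
    return [sum(samples) / len(epochs) for samples in zip(*epochs)]


deviant_epochs = [[0.0, -4.0, -2.0], [0.0, -6.0, -2.0]]    # Bin 1
standard_epochs = [[0.0, -1.0, -1.0], [0.0, -1.0, -3.0]]   # Bin 2

b1 = average_epochs(deviant_epochs)
b2 = average_epochs(standard_epochs)
b3 = [deviant - standard for deviant, standard in zip(b1, b2)]   # b3 = b1 - b2

print(b3)
```

In this toy example the difference wave is negative only where the deviant response is more negative than the standard response, which is what the MMN captures.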
FAQ
Why can taking notes during data collection be helpful during analysis?
Knowing in advance that there is something wonky in your data means that (1) you don’t have to scroll through all your data to identify bad channels, and (2) you don’t have to redo your analyses.
Why are all file names prefixed with 1?
When processing EEG data, you need to process data from multiple participants. Here, we’re looking at the data of only one participant (Subject 1 from the downloaded dataset), so this is what the number represents. It tells you which subject’s data you’re processing and, of course, stops you from overwriting a previously processed subject with new data.
First published on: June 26, 2025