Neural Networks
Description of Data and Task
In this tutorial, we will reproduce the facial emotion recognition
task from the slides using a pre-trained Convolutional Neural Network
(CNN) from Hugging Face that has been fine-tuned for facial emotion
recognition on image data (i.e.,
“dennisjooo/emotion_classification”).
Tasks
- Set up R by installing the reticulate package (R interface to
Python) and miniconda (a small Python distribution). Then install the
Python packages torch and tf-keras for NNs, transformers for
Transformers, and datasets for access to the Hugging Face database.
# Run the setup only if reticulate is not installed yet.
# (Run the body by hand if only the Python packages are missing.)
if (!requireNamespace("reticulate", quietly = TRUE)) {
  # Python:
  install.packages("reticulate")
  library(reticulate)
  install_miniconda()
  # NNs:
  reticulate::py_install("torch", pip = TRUE)
  reticulate::py_install("tf-keras", pip = TRUE)
  # Transformers:
  reticulate::py_install("transformers", pip = TRUE)
  # Hugging Face:
  reticulate::py_install("datasets", pip = TRUE)
}
- Import the datasets Python package into your current R session using
the import() function (similar to loading R packages with library()).
Note that you have to save Python packages as variables (they then
appear in your “Environment” in RStudio, just like after defining
x = 1), and that you can only use a function from the package by calling
it through this variable with the syntax package$function(). This is
similar to (a) the mlr3verse syntax, where models etc. are objects with
certain attributes (e.g., functions) that can be accessed with
object$attribute, and (b) accessing functions from R packages without
loading the entire package using the syntax package::function().
- Load the dataset “FastJobs/Visual_Emotional_Analysis” from Hugging
Face into R and save it as “dat”. Make sure to download only the
training data by calling the load_dataset() function with the argument
split = 'train'.
- Transform the Python variable “dat” into a data type that can be
used in R (e.g., a tibble; Hint: If you copy the code from the slides,
you should also include the part where the true numeric classes are
translated to verbal labels).
- Load the pre-trained CNN “dennisjooo/emotion_classification” as a
pipeline (i.e., including the associated image pre-processor, which has
the same name/path on Hugging Face) and save it as “ppl” for the facial
emotion recognition task below.
- For one randomly selected image from the dataset “dat”, use the
pipeline from task 5 to predict the facial emotion. According to the
model, what are the most likely and least likely emotions that the
person in the image is displaying? Note that we only do this task for
one image because running it for the whole dataset takes a while, but
feel free to play around with it as you like.
- Bonus: Explore other pre-trained models at https://huggingface.co/models. Note that you can filter
all available models by the task(s) to which they can be applied, or by
the data they use. The “Model card” usually contains useful information
about how the model can be used (in Python; Hint: Use GPT to translate
it to R reticulate code). Furthermore, note that since Hugging Face is
an open platform, anyone can publish models or datasets there, so there
is also a lot of low-quality material available (e.g., a positive
example is https://huggingface.co/j-hartmann/emotion-english-distilroberta-base,
but a negative one is https://huggingface.co/MG31/license_aug). Pay attention
to the number of likes and downloads, which indicate whether the
model/dataset might be useful or not.
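The tasks above can be sketched roughly as follows. This is one possible approach under stated assumptions, not the solution from the slides. The function names (load_emotion_data, labels_to_tibble, load_emotion_pipeline, rank_emotions, run_demo) are made up for illustration. The sketch assumes that the dataset has a "label" column of type ClassLabel, that the Python package Pillow is available for decoding images, and that your transformers version accepts the image_processor argument in pipeline().

```r
# Tasks 2-3: import the Python package as an R variable and load the training split.
load_emotion_data <- function() {
  datasets <- reticulate::import("datasets")
  datasets$load_dataset("FastJobs/Visual_Emotional_Analysis", split = "train")
}

# Task 4: tibble with the numeric classes and their verbal labels.
# Assumption: the class names are stored in the ClassLabel feature "label".
labels_to_tibble <- function(dat) {
  class_names <- dat$features[["label"]]$names
  label_id <- unlist(dat$select_columns("label")$to_dict()$label)
  tibble::tibble(
    row      = seq_along(label_id),
    label_id = label_id,
    label    = class_names[label_id + 1]  # Python classes are 0-based
  )
}

# Task 5: model and image pre-processor share the same name/path.
load_emotion_pipeline <- function(path = "dennisjooo/emotion_classification") {
  transformers <- reticulate::import("transformers")
  transformers$pipeline("image-classification",
                        model = path, image_processor = path)
}

# Task 6 helper (pure R): turn the pipeline output, a list of
# list(label = ..., score = ...), into a data frame sorted by score.
rank_emotions <- function(pred) {
  out <- data.frame(
    label = vapply(pred, function(p) p$label, character(1)),
    score = vapply(pred, function(p) p$score, numeric(1)),
    stringsAsFactors = FALSE
  )
  out <- out[order(out$score, decreasing = TRUE), ]
  rownames(out) <- NULL
  out
}

# Task 6: predict the emotion for one randomly selected image.
run_demo <- function() {
  dat <- load_emotion_data()
  ppl <- load_emotion_pipeline()
  i <- sample.int(dat$num_rows, 1L) - 1L          # 0-based Python index
  item <- reticulate::py_get_item(dat, i)
  n_classes <- as.integer(ppl$model$config$num_labels)
  pred <- ppl(item[["image"]], top_k = n_classes)  # scores for all classes
  rank_emotions(pred)
}

# run_demo()  # first row: most likely emotion; last row: least likely
```

Calling run_demo() downloads the dataset and the model on first use, so it needs an internet connection and may take a while. rank_emotions() works on any list of label/score pairs, so you can try it without Python.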