Project offers

Leveraging Artificial Intelligence to Tackle Lung Cancer

Summary

Lung cancer remains the leading cause of cancer-related deaths worldwide. AI has recently emerged as a transformative tool for enhancing medical decision-making. However, its widespread adoption faces several challenges, including data quality, model transparency, and interpretability. This thesis seeks to explore how innovative AI techniques can revolutionize lung cancer research and treatment, offering new opportunities to address these challenges. It aims to contribute to the broader application of AI in healthcare.

Degree

MDS

Supervisor

ALBERTO CABELLOS APARICIO

Department

AC

Description

This project offers the candidate a unique opportunity to apply artificial intelligence techniques to real-world challenges in lung cancer research and treatment. As part of this thesis, the candidate will work with datasets, including patient records, genetic data, molecular alterations, treatment outcomes, and exposome data. These datasets will serve as the foundation for developing AI models that address critical challenges in lung cancer treatment, such as predicting patient outcomes and identifying optimal treatment strategies.

The candidate will focus on the following core tasks:

Data Exploration and Preprocessing: The candidate will gain experience in handling complex medical datasets by cleaning, preparing, and structuring the data to ensure it is suitable for advanced AI analysis.

Building AI Models: Using machine learning and deep learning techniques, the candidate will develop models aimed at predicting lung cancer progression, evaluating treatment efficacy, and understanding the impact of various environmental and genetic factors.

Interpretability and Explainability: A significant emphasis will be placed on making AI models interpretable and transparent. The candidate will explore techniques to ensure that the models produced are not just accurate but also explainable, providing healthcare professionals with clear insights into the model's predictions and decisions.

Exploring Interaction Networks: The candidate will analyze interaction networks, studying relationships between patient genetics, environmental factors, and treatment responses to identify key drivers of lung cancer outcomes.

Throughout the project, the candidate will not only gain hands-on experience with cutting-edge AI tools and methodologies but also develop a deeper understanding of AI's role in healthcare. This project provides an impactful opportunity to contribute to a field where AI innovation can directly improve patient outcomes.

Minimum requirements

- Machine Learning

- TensorFlow/PyTorch

Deep Learning-based optimization of electric power systems

Summary

This master's thesis (TFM) is remunerated through a part-time research internship paying €600/month. The project addresses the optimization of power flows in electric power systems with high penetration of renewable generation, using Deep Reinforcement Learning and Graph Neural Networks.

Distributed training of LLMs (paid position, with Qualcomm)

Summary

State-of-the-art models such as LLMs are too large to fit in a single compute node (GPU, NPU, CPU), both for training and inference on a device (e.g., phone, laptop, tablet) or in larger-scale data centers. There is a need to develop optimization techniques to split and place these models onto a distributed set of compute nodes so that the overall system performance is maximized. The research will be focused on optimizing the placement of AI models onto distributed systems considering training time, energy consumption, and computational resources.

Degree

MDS

Supervisor

SERGI ABADAL CAVALLÉ

Department

AC

Description

State-of-the-art models such as LLMs are too large to fit in a single compute node (GPU, NPU, CPU), both for training and inference on a device (e.g., phone, laptop, tablet) or in larger-scale data centers. There is a need to develop optimization techniques to split and place these models onto a distributed set of compute nodes so that the overall system performance is maximized. The research will be focused on optimizing the placement of AI models onto distributed systems considering training time, energy consumption, and computational resources.

In this thesis, the student will model and simulate distributed AI workloads using both mathematical frameworks and simulators. This encompasses modeling the network, compute, and memory components of a distributed architecture, developing new modules to enhance the modeling process, and evaluating and optimizing various parallelization techniques to improve overall system performance.
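As a rough illustration of the kind of placement problem described above, the following Python sketch splits a chain of layers into contiguous pipeline stages so that the slowest stage is as cheap as possible. It is only a toy under strong assumptions: the per-layer costs are hypothetical numbers, every compute node is assumed identical, and communication, memory, and energy are ignored. The function name `balanced_pipeline_split` is invented for this example.

```python
from functools import lru_cache

def balanced_pipeline_split(layer_costs, n_stages):
    """Split layers into contiguous stages, minimizing the cost of the slowest stage.

    Returns (bottleneck_cost, cut_points), where each cut point is the index
    of the first layer of the next stage.
    """
    n = len(layer_costs)
    if not 1 <= n_stages <= n:
        raise ValueError("n_stages must be between 1 and the number of layers")

    # prefix[i] holds the total cost of layers 0..i-1
    prefix = [0]
    for cost in layer_costs:
        prefix.append(prefix[-1] + cost)

    @lru_cache(maxsize=None)
    def best(start, stages_left):
        if stages_left == 1:
            return prefix[n] - prefix[start], ()
        result = None
        # Leave at least one layer for each of the remaining stages
        for cut in range(start + 1, n - stages_left + 2):
            head = prefix[cut] - prefix[start]
            tail, cuts = best(cut, stages_left - 1)
            candidate = (max(head, tail), (cut,) + cuts)
            if result is None or candidate[0] < result[0]:
                result = candidate
        return result

    return best(0, n_stages)

# Hypothetical per-layer compute costs (arbitrary units)
costs = [4, 2, 3, 7, 1, 5]
bottleneck, cuts = balanced_pipeline_split(costs, n_stages=3)
print(bottleneck, cuts)
```

A realistic study would replace the scalar costs with measured or simulated compute, memory, and network terms, which is where the modeling work described in this offer comes in.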

The Universitat Politècnica de Catalunya · BarcelonaTech offers Master thesis fellowships in the field of LLM training. The research will be supported by Qualcomm and will be carried out in an environment with a strong interaction with leading experts in the field, with opportunities for doing internships in the company.

More information here: https://www.cs.upc.edu/~jordicf/priv/eda/llm_qc.html

Minimum requirements

Students with a strong background in Computer/Data Science and/or Mathematics are required, preferably with skills in:
- Algorithms and data structures
- Mathematical optimization
- Machine and deep learning
- Network or Computer Architecture
- Oral and written English

Leveraging hemodynamic changes acquired by fNIRS to guide the training of Large AI models

Summary

We want to demonstrate experimentally that fNIRS data carry neural-activity features that complement the information captured by a model, and that augmenting the model with these features improves its performance. To this end, we will collect data from participants and test how different Transformer models benefit from different types of fNIRS attention masks.

Degree

MDS

Supervisor

SERGI ABADAL CAVALLÉ

Department

AC

Description

Functional near-infrared spectroscopy (fNIRS) is a non-invasive neuroimaging technique that measures changes in oxygenated (HbO2) and deoxygenated (HbR) hemoglobin in the cerebral cortex. Due to its portability and low cost, fNIRS has been used in Brain-Computer Interface (BCI) applications, in characterizing hemodynamic responses to varying stimuli, and in investigating auditory and visual-spatial attention during Complex Scene Analysis (CSA). In this project, we want to design and implement an fNIRS study to investigate how neural and BCI signals can improve the training of the attention mechanism of Large AI (LAI) models (e.g., Transformer attention) during reading comprehension tasks (e.g., participants judging the quality of generated text). We want to demonstrate experimentally that fNIRS data carry neural-activity features that complement the information captured by a model, and that augmenting the model with these features improves its performance. To this end, we will collect data from participants and test how different Transformer models benefit from different types of fNIRS attention masks.

The candidate will:

- Carry out controlled studies and collect data from a number of participants.
- Try different approaches to incorporating the fNIRS signal in the training process of DL/LLM models and compare against baselines/experiments (these should be replicated).
- Carry out ablation studies, personalisation techniques, error analyses, etc.
- Contribute to authoring a scientific article.

Minimum requirements

- Coding skills in Python
- Familiarity with fNIRS technologies, e.g., Cortivision, Artinis (desirable, not obligatory)
- Experience with controlled experimentation methods for data acquisition (desirable, not obligatory)
- Experience in ML/DL techniques
- Experience in Signal Processing techniques

Leveraging human attention with eye tracking to guide the training of Large AI models

Summary

We want to demonstrate experimentally that eye tracking (ET) data carry linguistic features that complement the information captured by a model, and that augmenting the model with these features improves its performance. To this end, we will collect data from participants and test how different Transformer models benefit from different types of ET attention masks.

Degree

MDS

Supervisor

SERGI ABADAL CAVALLÉ

Department

AC

Description

Eye movement features are considered direct, low-cost signals of how human attention is distributed, which has inspired researchers to augment language models with eye-tracking (ET) data. In this project, we want to investigate how to operationalise ET features, such as first fixation duration (FFD) and total reading time (TRT), as cognitive signals that augment the attention mechanism of Large AI (LAI) models (e.g., Transformer attention) during training. We want to demonstrate experimentally that ET data carry linguistic features that complement the information captured by a model, and that augmenting the model with these features improves its performance. To this end, we will collect data from participants and test how different Transformer models benefit from different types of ET attention masks.
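To make the two gaze features concrete, here is a minimal Python sketch that derives per-word FFD and TRT from a chronological list of fixations. FFD is taken as the duration of the first fixation on a word and TRT as the sum of all fixation durations on it. The fixation data, the function names, and the idea of normalising TRT into per-token weights are assumptions made for illustration; how best to turn such features into an attention mask is precisely what the project would investigate.

```python
def reading_features(fixations, n_words):
    """Compute per-word FFD and TRT from chronological (word_index, duration_ms) pairs."""
    ffd = [0.0] * n_words
    trt = [0.0] * n_words
    seen = set()
    for word_idx, duration_ms in fixations:
        if word_idx not in seen:
            # First time this word is fixated: record first fixation duration
            ffd[word_idx] = float(duration_ms)
            seen.add(word_idx)
        # Every fixation, including regressions, counts towards total reading time
        trt[word_idx] += duration_ms
    return ffd, trt

def to_attention_weights(values):
    """Normalise non-negative feature values so that they sum to 1 (one possible choice)."""
    total = sum(values)
    if total == 0:
        return [0.0] * len(values)
    return [v / total for v in values]

# Hypothetical recording: the reader returns to word 0 and skips word 2
fixations = [(0, 200), (1, 150), (0, 100), (3, 250)]
ffd, trt = reading_features(fixations, n_words=4)
weights = to_attention_weights(trt)
print(ffd, trt, weights)
```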

The candidate will:

- Carry out controlled studies and collect data from a number of participants.
- Try different approaches to incorporating gaze features in the training process of DL/LLM models and compare against baselines/experiments (these should be replicated).
- Carry out ablation studies, personalisation techniques, error analyses, etc.
- Contribute to authoring a scientific article.

Minimum requirements

- Coding skills in Python
- Familiarity with eye tracking technologies, e.g., Tobii Pro Fusion (desirable, not obligatory)
- Experience with controlled experimentation methods for data acquisition (desirable, not obligatory)
- Experience in ML/DL techniques
- Experience in Signal Processing techniques

Detecting Cognitive Distortions in Social-Media Text with Large Language Models (LLMs) to Strengthen Digital Mental-Health Screening

Summary

Cognitive distortions are systematic, biased thought patterns that skew how people interpret events, themselves, and others. They are transdiagnostic, contributing to the onset and maintenance of numerous mental-health disorders. Objectives: (1) develop and validate an LLM-based pipeline that detects cognitive distortions in social-media posts, classifies each into one of ten standard distortion types, and provides explainable outputs; (2) quantify the frequency and subtype distribution of cognitive distortions across Reddit mental-health communities representing key diagnoses (depression, anxiety, PTSD, bipolar disorder, OCD, BPD, eating disorders, ADHD, ASD, schizophrenia, DID).

Degree

MDS

Supervisor

CARLOS ESCOLANO PEINADO

Department

CS

Description

1 Background

Cognitive distortions are systematic, biased thought patterns that skew how people interpret events, themselves, and others. They are transdiagnostic, contributing to the onset and maintenance of numerous mental-health disorders (Table 1). Because these thoughts are automatic, emotionally charged, and resistant to rational disconfirmation, they are a core target of psychotherapy (psychological interventions), especially cognitive-behavioural therapy (CBT).

Table 1. Mental health diagnoses and illustrative cognitive distortions

Diagnosis | Illustrative distortion
Major Depressive Disorder | "I'm a complete failure."
Anxiety Disorders (GAD, social, panic) | "Something awful will happen and I won't cope."
Bipolar Disorder (mania) | "Nothing can go wrong; I'm invincible."
Obsessive-Compulsive Disorder | "If I don't perform this ritual, disaster is certain."
Eating Disorders (AN, BN, BED) | "One 'bad' food will make me fat and worthless."
Borderline Personality Disorder | "If they don't reply at once, they must hate me."
PTSD / Complex PTSD | "The trauma was my fault; I should have stopped it."
Schizophrenia-spectrum | "Strangers are definitely plotting against me."
ADHD (rejection sensitivity) | "I always mess things up; people will give up on me."
Autism Spectrum Disorder | "If I break my routine, the day is ruined."
Dissociative Identity Disorder | "When things go wrong, it's my other parts' fault."

Mental health disorders are highly prevalent [Santomauro 2021], typically begin in adolescence or early adulthood [Solmi 2022], shorten life expectancy by approximately 10 years [Walker 2015], and, despite advances in treatment, remain major contributors to global disability [Ferrari 2022]. Although psychotherapy is effective in reducing distorted thinking, many individuals remain undiagnosed and untreated due to limited access to clinicians, stigma, or delayed help-seeking.

The promise of language technology

Large-scale language technologies offer a scalable solution. Natural-language processing (NLP) systems and, most recently, large language models (LLMs) can detect cognitive distortions in free text. Proof-of-concept studies using clinician-annotated forum posts already show encouraging accuracy [Simms 2017; Shickel 2019; Shreevastava 2021]. Social-media mining has successfully flagged suicidal ideation [Ramírez-Cifuentes 2020] and anorexia nervosa content [Ramírez-Cifuentes 2021]; LLMs now reach state-of-the-art performance for identifying depression and suicidality in clinical narratives [Lho 2025].

Our group contributes two key building blocks to this research:

- Psycholinguistic signature study (Molins et al., in preparation): revealed that distorted thoughts are longer, more emotionally negative, structurally imbalanced, and linguistically incoherent compared to alternative, restructured thoughts. Each distortion type also displays unique emotional and lexical markers.
- Classifier suite (Zakreva et al., in preparation): achieved F1 scores of 0.92 for binary detection (distortion vs. non-distortion) and 0.83 for 10-way distortion classification. The models include interpretable outputs, highlighting psychologically coherent lexical and syntactic patterns (e.g., "debería" → should-statement distortion).

These foundational studies enable the development of scalable, explainable tools to detect cognitive distortions in public social-media discourse, opening pathways for early screening, digital phenotyping, and targeted intervention.

2 Objectives

Develop and validate an LLM-based pipeline that:
1. Automatically detects cognitive distortions in social-media posts.
2. Classifies each distortion into one of ten standard cognitive-distortion types (Table 2).
3. Provides explainable outputs (e.g., salient lexical/syntactic cues contributing to the decision).

Quantify the frequency and subtype distribution of cognitive distortions across Reddit mental-health communities representing key diagnoses (depression, anxiety, PTSD, bipolar disorder, OCD, BPD, eating disorders, ADHD, ASD, schizophrenia, DID).

3 Methods

3.1 Data Collection

Extract Reddit submissions (2021–2025) via Pushshift.
Target subreddits: r/depression, r/Anxiety, r/OCD, r/EatingDisorders, r/AnorexiaNervosa, r/BPD, r/PTSD, r/bipolar, r/ADHD, r/autism, r/schizophrenia, r/DID.
Randomly sample control posts from general-interest subreddits.
Remove deleted/flagged content, filter to Spanish language, and fully anonymise.

3.2 Pre-processing

Sentence segmentation; toxicity screening.
Compute psycholinguistic features (valence, arousal, concreteness, dependency depth).

3.3 Model Development

Base model: fine-tuned LLaMA-3 8B (or equivalent instruction-tuned LLM).
Add two heads:
- Binary classifier: distortion present/absent.
- Multi-class classifier: one of 10 cognitive distortion types.
Fine-tune using:
- Manually labeled dataset (clinician-reviewed; inter-rater κ ≥ 0.80),
- Augmented synthetic samples (e.g., contrastive pairs),
- Public datasets harmonized to the distortion taxonomy.

3.4 Evaluation

Metrics: Precision, Recall, Specificity, Accuracy, macro/micro F1, AUROC.
External validity: 5% held-out set double-coded by blinded expert clinicians.
Explainability: SHAP/attention maps; qualitative review by clinicians.
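For readers less familiar with the listed metrics, the sketch below computes precision, recall, specificity, and accuracy for the binary task and macro/micro F1 for the multi-class task from plain label lists. It is an illustrative pure-Python version with made-up labels, not the project's evaluation code; in practice a library such as scikit-learn would normally be used, and AUROC is omitted because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred, label):
    """Return (tp, fp, fn, tn), treating `label` as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    tn = len(y_true) - tp - fp - fn
    return tp, fp, fn, tn

def safe_div(num, den):
    """Division that returns 0.0 instead of failing on an empty denominator."""
    return num / den if den else 0.0

def binary_report(y_true, y_pred, positive=1):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    return {
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "accuracy": safe_div(tp + tn, len(y_true)),
    }

def macro_micro_f1(y_true, y_pred):
    labels = sorted(set(y_true) | set(y_pred))
    counts = [confusion_counts(y_true, y_pred, lab) for lab in labels]
    # Macro: average the per-class F1 scores, so rare classes weigh as much as common ones
    macro = sum(safe_div(2 * tp, 2 * tp + fp + fn) for tp, fp, fn, _ in counts) / len(labels)
    # Micro: pool the counts over classes first, so frequent classes dominate
    tp_sum = sum(c[0] for c in counts)
    fp_sum = sum(c[1] for c in counts)
    fn_sum = sum(c[2] for c in counts)
    micro = safe_div(2 * tp_sum, 2 * tp_sum + fp_sum + fn_sum)
    return macro, micro

# Made-up labels: 1 = distortion present, 0 = absent
print(binary_report([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
# Made-up distortion types "a", "b", "c"
print(macro_micro_f1(["a", "a", "b", "c"], ["a", "b", "b", "c"]))
```

The gap between macro and micro F1 is informative here because distortion types are unlikely to be equally frequent.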

3.5 Statistical Analysis

Compare frequency and types of cognitive distortions across subreddits using:
- Welch's t-tests (for count data),
- Chi-squared tests (for proportions),
- Mixed-effects models (to adjust for time of post, engagement metrics).
Correct for multiple comparisons using Benjamini-Hochberg.
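The Benjamini-Hochberg step can be illustrated with a short pure-Python sketch. The p-values are invented for the example, and the function name is hypothetical. The procedure sorts the p-values, finds the largest rank k whose p-value is at most (k/m)·alpha, and rejects every hypothesis up to that rank.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k (step-up behaviour)
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected

# Invented p-values from four pairwise subreddit comparisons
print(benjamini_hochberg([0.02, 0.021, 0.022, 0.9]))
```

In this example the smallest p-value (0.02) exceeds its own threshold (0.0125) but is still rejected, because a larger-ranked p-value passes its threshold. This is the step-up property that distinguishes the procedure from a simple per-test cutoff.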

4 Expected Impact

Clinical Utility: Provide clinicians and moderators with a tool to screen for maladaptive thinking in real-time social media content.
Research Advancement: Enable fine-grained phenotyping of psychiatric language across diagnoses.
Open Science Contribution: Release anonymized datasets, trained models, and an explainable AI pipeline for use in mental health research and technology development.

5 References (ordered by first appearance in the text)

Santomauro, D. F., Herrera, A. M. M., Shadid, J., et al. (2021). Global prevalence and burden of depressive and anxiety disorders in 204 countries and territories in 2020 due to the COVID-19 pandemic. The Lancet, 398(10312), 1700–1712. https://doi.org/10.1016/S0140-6736(21)02143-7
Solmi, M., Radua, J., Olivola, M., et al. (2022). Age at onset of mental disorders worldwide: A large-scale meta-analysis of 192 epidemiological studies. Molecular Psychiatry, 27(1), 281–295. https://doi.org/10.1038/s41380-021-01161-9
Walker, E. R., McGee, R. E., & Druss, B. G. (2015). Mortality in mental disorders and global disease-burden implications: A systematic review and meta-analysis. JAMA Psychiatry, 72(4), 334–341. https://doi.org/10.1001/jamapsychiatry.2014.2502
Ferrari, A. J., Mantilla-Herrera, A. M., Shadid, J., et al. (2022). Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990–2019: A systematic analysis for the Global Burden of Disease Study 2019. The Lancet Psychiatry, 9(2), 137–150. https://doi.org/10.1016/S2215-0366(21)00395-3
Simms, T., Ramakrishnan, S., Karmakar, C., & Mendoza, A. (2017). Detecting cognitive distortions through machine-learning text analytics. In Proceedings of the 2017 IEEE International Conference on Healthcare Informatics (ICHI 2017) (pp. 508–512). IEEE. https://doi.org/10.1109/ICHI.2017.39
Shickel, B., Berry, A., Rashidi, P., & Florida, R. (2019). Automatic detection and classification of cognitive distortions in mental-health text. In Proceedings of the IEEE 20th International Conference on Bioinformatics and Bioengineering (BIBE 2020) (pp. 275–280). IEEE. https://doi.org/10.1109/BIBE50027.2020.00052
Shreevastava, S., & Foltz, P. W. (2021). Detecting cognitive distortions from patient-therapist interactions. In Proceedings of the 7th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2021) (pp. 151–158). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.clpsych-1.17
Ramírez-Cifuentes, D., Freire, A., Baeza-Yates, R., et al. (2020). Detection of suicidal ideation on social media: Multimodal, relational, and behavioral analysis. Journal of Medical Internet Research, 22(7), e17758. https://doi.org/10.2196/17758
Ramírez-Cifuentes, D., Freire, A., Baeza-Yates, R., et al. (2021). Characterization of anorexia nervosa on social media: Textual, visual, relational, behavioral, and demographical analysis. Journal of Medical Internet Research, 23(7), e25925. https://doi.org/10.2196/25925
Lho, S. K., Park, S. C., Lee, H., et al. (2025). Large language models and text embeddings for detecting depression and suicide in patient narratives. JAMA Network Open, 8(5), e2511922. https://doi.org/10.1001/jamanetworkopen.2025.11922

Minimum requirements

Previous knowledge in Python scripting and Natural Language Processing.

The Human Intranet and its Healthcare Applications

Summary

Recent advancements in nanotechnology have enabled the concept of the "Human Intranet", where devices inside and on our body can sense and communicate, opening the door to multiple exciting applications in the healthcare domain. This thesis aims to delve into the computing, communication, and localization aspects of the "Human Intranet" and how to practically realize them in the next decade.

Degree

MDS

Supervisor

SERGI ABADAL CAVALLÉ

Department

AC

Description

Recent advancements in nanotechnology have enabled the development of means for sensing and wireless communications with unprecedented miniaturization and capabilities, to the point that they can be introduced into the gastrointestinal tract inside a pill or into the bloodstream in the form of passively flowing nanomachines.

This opens the door to the idea of intra-body communication networks, that is, a swarm of nanosensors inside the human body that communicate to coordinate their actions and to sense and localize specific events (lack of oxygen, biomarkers, etc.). This can lead to the development of applications such as continuous monitoring of diabetes, detection and localization of cancer micro-tumors, or early detection of blood clots. These possibilities are currently being investigated by our team at N3Cat (www.n3cat.upc.edu).

In this context, we are looking for excellent and self-motivated individuals who are eager to work on developing AI schemes (based on graph neural networks or multi-agent RL) for the detection and localization of events inside the human body. Data will be gathered with an in-house simulator that integrates mobility models (BloodVoyagerS) and communication models (TeraSim).

Minimum requirements

Applicants are required to have a notable academic record. Moreover, candidates are expected to be:
- Highly motivated and self-sufficient,
- Willing to learn and attentive to detail,
- Proficient in English (both oral and written).

Psycholinguistic Analysis of Speech in Bipolar Disorder during Different Phases (Manic, Depressive, and Euthymic)

Summary

Bipolar disorder (BD) is a chronic and disabling psychiatric condition, typically emerging in adolescence or early adulthood. It is defined by recurrent mood episodes (mania and depression) alternating with periods of relative mood stability, known as euthymia (Nierenberg et al., 2023). Objective: to investigate the psycholinguistic features of speech in BD using cognitive network science, focusing on (i) core actors and self-referential language, (ii) semantic framing and emotional profiles, and (iii) structural properties of language networks across illness phases.

Degree

MDS

Supervisor

CARLOS ESCOLANO PEINADO

Department

CS

Description

Background
Bipolar disorder (BD) is a chronic and disabling psychiatric condition, typically emerging in adolescence or early adulthood. It is defined by recurrent mood episodes (mania and depression) alternating with periods of relative mood stability, known as euthymia (Nierenberg et al., 2023). Each phase exhibits distinct cognitive, emotional, behavioral, and linguistic profiles:

Mania is characterized by elevated energy, rapid speech, and inflated self-esteem or grandiosity.
Depression involves psychomotor slowing, reduced speech output, and pervasive hopelessness.
Euthymia is considered a phase of clinical remission, though subtle cognitive and emotional disturbances may persist.

Language, as expressed through speech, provides a unique window into cognitive and emotional states and is the pillar of the psychiatric evaluation. Changes in mood, cognition, and energy are mirrored in speech along several dimensions:

Formal features (e.g., fluency, latency, syntactic complexity) differ by phase: manic speech is often pressured and tangential, while depressive speech tends to be slowed and impoverished (Weiner et al., 2019).
Content features (e.g., semantic coherence, conceptual diversity) reflect thought structure: tangentiality and circumstantiality often emerge during mania; depressive speech may show rumination and narrowed semantic range (Iter, Yoon, & Jurafsky, 2018).
Emotional features (e.g., affective word use, valence) distinguish affective states: manic language is often overly positive, while depressive speech contains more negative emotional content (Khorram et al., 2018).

Despite their diagnostic relevance, current assessments rely heavily on subjective clinical impressions, which may lack sensitivity to subtle or early changes and suffer from low inter-rater reliability.

Cognitive network science provides a computational framework to model these psycholinguistic features by representing language as graphs that reflect the structure of thought (Siew et al., 2019). This approach has been applied to the study of suicidal ideation (Teixeira et al., 2021), depression, and anxiety (Fatima et al., 2021), offering novel insights into disordered thinking.

In our recent work, we applied cognitive network and NLP techniques to study cognitive distortions, that is, maladaptive thought patterns (Molins et al., in preparation). Distorted thoughts showed greater length, more negative emotional tone, lower coherence, and specific topological features. Subtypes such as catastrophizing and labeling had distinct profiles: catastrophizing was linked to sadness, labeling to disgust.

Hypothesis
We hypothesize that speech produced during manic, depressive, and euthymic phases in BD exhibits distinct psycholinguistic profiles, including differences in emotional valence, semantic framing, and structural organization. These differences can be objectively quantified and may serve as digital biomarkers of phase-specific cognitive-affective states.

Aims
To investigate the psycholinguistic features of speech in BD using cognitive network science, focusing on:

Core actors and self-referential language
Semantic framing and emotional profiles
Structural properties of language networks across illness phases

We aim to detect hidden cognitive patterns that differ by mood state and may inform future diagnostic, monitoring or treatment-response prediction tools.

Methods
This is a naturalistic observational study. Participants with BD were recruited during acute manic or depressive episodes and recorded twice: once during the episode and again after symptomatic improvement. Euthymic participants were recorded once. All sessions were conducted using a standardized dual-microphone setup. The interview protocol (Figure 1) included:

Clinical assessments: Young Mania Rating Scale (YMRS) and Hamilton Depression Rating Scale (HDRS-17)
Cognitive task: Stroop Test (inhibitory control)
Standard reading: "The Rainbow Passage"
Non-emotional storytelling: "Cookie Theft" picture description
Emotional storytelling: Autobiographical recollection of emotionally significant events

In total, 85 participants were included (25 manic, 25 depressed, 35 euthymic). Audio was diarized and automatically transcribed.

Analysis Plan

(i) Network Construction and Sentiment Annotation

Two types of linguistic networks will be created (Figure 2):

Co-occurrence (CO) networks: capturing word adjacency
Subject-Verb-Object (SVO) networks: capturing syntactic actor-action relationships
Each node will be annotated with valence (+1/−1/0) and basic emotion labels using the NRC Emotion Lexicon (Mohammad & Turney, 2013; Warriner, Kuperman, & Brysbaert, 2013), with adaptations for Spanish.
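A co-occurrence network with valence-annotated nodes can be sketched in a few lines of Python. The example below is only a toy: the three-word valence lexicon is invented (a real analysis would load the NRC Emotion Lexicon with the Spanish adaptations mentioned above), the sentence is in English for readability, and the window of one word is an assumption. SVO networks would additionally require a syntactic parser and are not shown.

```python
from collections import defaultdict

# Invented toy lexicon, standing in for a real valence resource
TOY_VALENCE = {"happy": 1, "sad": -1, "tired": -1}

def cooccurrence_network(tokens, window=1):
    """Build undirected weighted edges between tokens at most `window` positions apart."""
    edges = defaultdict(int)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                # Sort the pair so that (a, b) and (b, a) count as the same edge
                edges[tuple(sorted((word, other)))] += 1
    return dict(edges)

def annotate_valence(nodes, lexicon):
    """Assign +1, -1, or 0 (neutral or unknown) to every node."""
    return {node: lexicon.get(node, 0) for node in nodes}

tokens = "i feel sad and i feel tired".split()
edges = cooccurrence_network(tokens)
nodes = {word for pair in edges for word in pair}
valence = annotate_valence(nodes, TOY_VALENCE)
print(edges)
print(valence)
```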

(ii) Core Actors and Self-Perception

Using SVO networks, we will quantify the prominence of personal pronouns (e.g., "I") and their associated actions or objects to infer self-perception patterns. Inter-phase comparisons will reveal shifts in egocentric focus and social orientation (Figure 3) (Teixeira et al., 2021).

(iii) Semantic Frames and Emotional Profiles

We will analyze the emotional framing of key concepts, especially self-related terms, across phases. Emotional profiles will be visualized using z-score based emotion wheels and expected vs. observed emotional richness metrics (Figure 4).

Impact
This project may uncover linguistic markers specific to BD mood states, providing insights into cognitive-emotional dynamics and enabling the development of phase-sensitive interventions. As one of the few studies conducted in Spanish, it also supports the cultural adaptation of digital psychiatry tools.

This project enables students to:

Apply NLP and network modeling to real psychiatric speech data
Work on emotion detection, semantic modeling, and graph-based learning.
Contribute to improving mental health diagnostics through objective, explainable AI tools.

Clinical Significance:

May uncover hidden linguistic signatures of manic, depressive, and euthymic states in BD.
Could assist in early detection and tailored treatment planning.
Supports language-specific analysis in Spanish, addressing a major gap in current research.
Offers a scalable, language-based monitoring tool that complements clinical care

References

Nierenberg, A.A. et al. (2023). Diagnosis and Treatment of Bipolar Disorder: A Review. JAMA, 329(14), 1370–1380. https://doi.org/10.1001/jama.2023.18588
Weiner, L. et al. (2019). Thought and language disturbance in bipolar disorder quantified via process-oriented verbal fluency measures. Scientific Reports, 9(1). https://doi.org/10.1038/s41598-019-50818-5
Iter, D., Yoon, J., & Jurafsky, D. (2018). Automatic Detection of Incoherent Speech for Diagnosing Schizophrenia. Proceedings of ACL, 136–146. https://doi.org/10.18653/v1/w18-0615
Khorram, S. et al. (2018). The PRIORI Emotion Dataset: Linking Mood to Emotion Detected In-the-Wild. arXiv preprint. http://arxiv.org/abs/1806.10658
Siew, C.S.Q. et al. (2019). Cognitive network science: A review of research on cognition through the lens of network representations. Complexity, 2019. https://doi.org/10.1155/2019/2108423
Teixeira, A.S. et al. (2021). Revealing semantic and emotional structure of suicide notes with cognitive network science. Scientific Reports, 11, 98147. https://doi.org/10.1038/s41598-021-98147-w
Fatima, A. et al. (2021). DASentimental: Detecting Depression, Anxiety, and Stress in Texts via Emotional Recall, Cognitive Networks, and Machine Learning. Big Data and Cognitive Computing, 5(4), 77. https://doi.org/10.3390/BDCC5040077
Warriner, A.B., Kuperman, V., & Brysbaert, M. (2013). Norms of valence, arousal, and dominance for 13,915 English lemmas. Behavior Research Methods, 45, 1191–1207.
Mohammad, S.M. & Turney, P.D. (2013). Crowdsourcing a Word-Emotion Association Lexicon. Computational Intelligence, 29(3), 436–465.

Minimum requirements

Previous knowledge in Python scripting and Natural Language Processing

Scalable Quantum Computing Algorithms and Systems

Summary

Quantum computers promise exponential improvements over conventional ones due to the extraordinary properties of qubits. However, quantum computing faces many challenges relative to the scaling of the algorithms and of the computers that run them. This thesis delves into these challenges and proposes solutions to create scalable quantum computing systems.

ML-based detection of advanced persistent threat (APT) with Telefonica

Summary

This project aims to design novel ML-based advanced persistent threat (APT) detection algorithms. The student will work closely with industry experts from Telefónica on the development of tools and methods to combat APTs.

Degree

MDS

Supervisor

PERE BARLET ROS

Department

AC

Description

Advanced Persistent Threats (APTs) have become one of the most significant cybersecurity challenges. These attacks are carried out stealthily and over long periods of time (e.g., months or years), causing financial losses, data breaches, and operational disruptions. Famous examples of APTs are the SolarWinds cyber-attack and NotPetya, which affected multiple government agencies and global enterprises like Maersk. To defend against such threats, methods based on audit logs have been widely developed to detect suspicious behaviors in the system. Specifically, cybersecurity experts have turned their attention to data provenance to develop tools and methods to combat APTs. Data provenance consists of parsing system logs and building provenance graphs that describe the entire system execution.
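The idea of a provenance graph can be sketched as follows. This Python toy assumes a heavily simplified, hypothetical log format of (subject, action, object) triples and draws each edge from the acting process to the entity it acts on; real audit logs are far richer, and real systems usually orient edges by information flow. All entity names and the address (taken from a documentation-only IP range) are invented.

```python
from collections import defaultdict, deque

# Hypothetical, simplified audit events: (subject, action, object)
EVENTS = [
    ("proc:bash", "fork", "proc:curl"),
    ("proc:curl", "connect", "sock:203.0.113.5:443"),
    ("proc:curl", "write", "file:/tmp/payload"),
    ("proc:bash", "exec", "file:/tmp/payload"),
    ("proc:sshd", "read", "file:/etc/passwd"),
]

def build_provenance_graph(events):
    """Map each subject to the list of (action, object) edges it produced."""
    graph = defaultdict(list)
    for subject, action, obj in events:
        graph[subject].append((action, obj))
    return graph

def reachable(graph, start):
    """Return every entity reachable from `start` by following edges (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _action, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}

graph = build_provenance_graph(EVENTS)
# Everything touched, directly or indirectly, by the bash process
print(sorted(reachable(graph, "proc:bash")))
```

Tracing what a suspicious process touched, as in the last line, is the kind of query that detection methods built on provenance graphs rely on.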

The objective of this thesis is to develop novel ML-based techniques for detecting APTs. During this project, the student will work with industry experts from Telefónica towards the design and implementation of a new set of tools for anomaly detection in audit logs. The student will gain hands-on experience in working with real-world datasets and will acquire technical skills to develop state-of-the-art anomaly detection techniques. In addition, the student will gain experience in graph-based ML, sequence modeling, and advanced embedding techniques.

Objectives:

· Design and develop novel pipelines for processing system log information.
· Research, apply and develop state-of-the-art methods in anomaly detection and cyber threat intelligence.
· Develop techniques for provenance-based intrusion detection systems to trace attack patterns.
· Contribute to research publications and technical documentation to present the results at international venues or conferences.

Minimum requirements

Students wishing to apply are expected to meet the following requirements:

· Highly motivated and self-sufficient
· Willing to learn and attentive to detail
· Minimal AI/ML background
· Proficient in English (spoken and written)

Graph Neural Networks (GNNs): Uses, Architectures, Algorithms

Summary

This thesis aims to explore the possibilities of a new and less studied variant of neural networks called Graph Neural Networks (GNNs). While convolutional networks are well suited to computer vision and recurrent networks to temporal analysis, GNNs are able to learn and model graph-structured relational data, with huge implications in fields such as quantum chemistry, computer networks, and social networks, among others.

Degree

MDS

Supervisor

SERGI ABADAL CAVALLÉ

Department

AC

Description

Since not every neural network fits every problem, and relational data is present in a wide variety of aspects of our daily life, the main focus of this thesis in N3Cat (www.n3cat.upc.edu) and BNN-UPC (www.bnn.upc.edu) is to explore the possibilities of Graph Neural Networks (GNNs), whose aim is to learn and model graph-structured relational data. We are looking for students willing to study the uses, architectures, and algorithms of GNNs. To this end, the candidate will work on ONE of the following areas:

Uses: Applying GNNs in emerging application areas, including but not limited to (1) electroencephalogram (EEG) analysis for Alzheimer's disease detection, epilepsy classification, and motor imagery; (2) acceleration of the compilation of quantum computing algorithms.
Architectures: Investigating ways to accelerate the processing of GNNs in multiple computing platforms (CPU, GPU, accelerators).
Algorithms: Developing meta-learning data-driven models to estimate the accuracy of a GNN for a given graph, without training, through synthetic graph generation and correlation analysis.
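For readers unfamiliar with how a GNN operates on graph-structured data, the core idea of neighbourhood aggregation (message passing) can be sketched in a few lines of plain Python. This is only an illustrative sketch under simplifying assumptions: the function name, the toy graph, and the choice of mean aggregation are hypothetical, and real GNN layers add learned weights and non-linearities on top of this step.

```python
def message_passing_step(adjacency, features):
    """One round of mean-aggregation message passing (no learned weights)."""
    updated = {}
    for node, neighbours in adjacency.items():
        # Include the node itself so isolated nodes keep their own features.
        group = [node] + list(neighbours)
        dim = len(features[node])
        updated[node] = [
            sum(features[n][d] for n in group) / len(group) for d in range(dim)
        ]
    return updated


# Hypothetical 3-node path graph a - b - c with 1-dimensional features.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0], "b": [0.0], "c": [5.0]}
smoothed = message_passing_step(adjacency, features)
```

After one step, each node's feature is the average of its own value and those of its neighbours, so information spreads along the edges of the graph. Stacking several such steps lets a node see progressively larger neighbourhoods.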

Minimum requirements

Students wishing to apply are required to have a notable academic record. Moreover, candidates are expected to be:
- Highly motivated and self-sufficient,
- Willing to learn and attentive to detail,
- Aware of current trends in machine learning and neural networks,
- Proficient in English (both spoken and written).

Excellent students may be eligible for a grant in this offer.

AI in cybersecurity at Nestlé SOC

Summary

UPC and Nestlé are offering a new position to carry out a bachelor's or master's thesis (TFG / TFM) in the field of AI and cybersecurity. This project will be fully funded (internship) and carried out within the Cyber Security Analytics team, part of Nestlé's Global Security Operations Center located in Barcelona.

Degree

MDS

Supervisor

Pere Barlet

Department

AC

Description

In today's digital landscape, the integration of AI and Big Data technologies has become essential for enhancing the cyber-resilience of companies large and small. As cyber threats grow increasingly sophisticated, leveraging these technologies is necessary to protect sensitive information and ensure that businesses can operate without disruption.

This internship offers a unique opportunity to work at the intersection of these topics: AI, Big Data, and Cybersecurity. You will join an experienced team of data engineers and data scientists, and you will have the chance to build your own AI models for innovative cybersecurity products. You will also analyze and process large amounts of data and work with cross-functional teams within the Security Operations Center to integrate these solutions into our SOAR system (Security Orchestration, Automation, and Response).

Minimum requirements

We are particularly interested in candidates with at least a minimal AI/ML background. Previous experience in cybersecurity will also be highly valued.

To apply, please send your CV and academic transcript (a PDF can be generated from the Raco) to both contacts:
- Ignasi Paredes (Nestlé):
- Pere Barlet (UPC):
