Overview
This proposed PhD research aims to conduct an in-depth investigation into the security and privacy risks associated with large language models (LLMs). As the deployment of LLMs becomes increasingly prevalent across various domains, including security-sensitive ones, understanding and mitigating potential threats is paramount. The focus will be on jailbreaks, safeguard evasion, and data privacy attacks, with the ultimate goal of proposing robust defensive strategies for safer LLMs.
Research Objectives:
1. Jailbreak Analysis:
– Examine the vulnerabilities that may allow LLMs to be jailbroken, enabling unauthorised manipulation of their behaviour or bypassing of their intended constraints.
– Investigate the implications of jailbreaks in terms of model integrity, user trust, and the broader security landscape.
2. Safeguard Evasion:
– Explore methods by which existing LLM security safeguards can be bypassed or evaded, for example through adversarial attacks or the exploitation of weaknesses in protective mechanisms.
– Assess the effectiveness of current safeguards and identify areas for improvement to enhance resilience against evasion attempts.
3. Data Privacy Attacks:
– Investigate the risks posed by LLMs in compromising user data privacy, both in terms of input data and generated output.
– Analyse potential scenarios where LLMs might inadvertently leak sensitive information and assess the magnitude of privacy threats.
4. Defensive Strategies:
– Devise and propose advanced defensive mechanisms to protect LLMs against jailbreaks, safeguard evasion, and data privacy attacks.
– Evaluate the efficacy of the proposed defenses through simulations, empirical studies, and real-world scenarios.
Significance:
This research will contribute to the growing body of knowledge on LLM security and privacy, providing insights that can inform the development of safer and more secure language models. The proposed defensive strategies aim to mitigate the identified risks, fostering the responsible deployment of LLMs in diverse applications.
Funding Information
To be eligible for consideration for a Home DfE or EPSRC Studentship (covering tuition fees and a maintenance stipend of approx. £19,237 per annum), a candidate must satisfy all the eligibility criteria based on nationality, residency and academic qualifications.
To be classed as a Home student, candidates must meet the following criteria and the associated residency requirements:
• Be a UK National, or
• Have settled status, or
• Have pre-settled status, or
• Have indefinite leave to remain or enter the UK.
Candidates from ROI may also qualify for Home student funding.
Previous PhD study MAY make you ineligible to be considered for funding.
Please note that other terms and conditions also apply.
Please note that any available PhD studentships will be allocated on a competitive basis across a number of projects currently being advertised by the School.
A small number of international awards will be available for allocation across the School. An international award is not guaranteed to be available for this project, and these awards will be allocated on a highly competitive basis across the School.
Academic Requirements:
The minimum academic requirement for admission is normally an Upper Second Class Honours degree from a UK or ROI Higher Education provider in a relevant discipline, or an equivalent qualification acceptable to the University.