The Research Data Scientist will work closely with the Program Director of the TraCS Data Science Lab (TDSL) and with TraCS leaders and team members to support clinical data science across UNC. The Research Data Scientist will (1) collaborate on translational research using real-world data, (2) design and build novel data resources and statistical methods to improve clinical data science, and (3) create curricula and training resources for a clinical data science education program at UNC. Depending on Program priorities and personal interests, the role offers flexibility for the Research Data Scientist to pursue extramural funding for research priorities in line with the overarching mission of the TDSL Program at UNC.
Minimum Education and Experience Requirements
Relevant post-Baccalaureate degree required (or foreign degree equivalent); for candidates demonstrating comparable independent research productivity, will accept a relevant Bachelor’s degree (or foreign degree equivalent) and 3 or more years of relevant experience in substitution. May require terminal degree and licensure.
Required Qualifications, Competencies, and Experience
This position requires a Master's or PhD in an applicable field and at least 2 years of relevant work experience. Excellent written and oral scientific communication skills and the ability to independently manage multiple projects are also required.
Preferred Qualifications, Competencies, and Experience
Professional and/or formal educational experience using large language models for research (deploying, tuning, and optimizing open- and closed-source models in local and cloud environments).
Experience with natural language processing using traditional data science approaches (word2vec, NLTK, Spark NLP, etc.) and/or NLP using large language models.
Demonstrated skill in semantic web technologies.
High level of proficiency in either R or Python in the areas of machine learning, deep learning, statistical analysis, computer vision, and/or graph analysis.
Experience with data engineering to create data science pipelines, deploy and scale models, and maintain and thoroughly document code and models.
Knowledge of SQL programming, data modeling, and relational database systems such as Oracle, MS-SQL, and MySQL.
Excellent ability to work independently and manage multiple projects at once while delivering high-quality work on time.
Strong interest in clinical data and desire to develop deep expertise in EHR and claims data.
Strong commitment to teaching and disseminating work through writing and speaking engagements. Creating and giving seminars and presentations and authoring manuscripts are expected.
Fluency in visualizing data for scientific purposes (e.g., creating figures for manuscripts and presentations).
Experience and interest in academic research (working with faculty, disseminating work through manuscripts and open-source software, and grant writing).