Advisory, Data Scientist - CMC Data Products
Company: Eli Lilly and Company
Location: Indianapolis
Posted on: January 12, 2026
Job Description:
At Lilly, we unite caring with discovery to make life better for
people around the world. We are a global healthcare leader
headquartered in Indianapolis, Indiana. Our employees around the
world work to discover and bring life-changing medicines to those
who need them, improve the understanding and management of disease,
and give back to our communities through philanthropy and
volunteerism. We give our best effort to our work, and we put
people first. We’re looking for people who are determined to make
life better for people around the world.

Organizational & Position Overview:
The Bioproduct Research and Development (BR&D)
organization strives to deliver creative medicines to patients by
developing and commercializing insulins, monoclonal antibodies,
novel therapeutic proteins, peptides, oligonucleotide therapies,
and gene therapy systems. This multidisciplinary group works
collaboratively with our discovery and manufacturing colleagues. We
are seeking an exceptional Data Scientist with deep data expertise
in the pharmaceutical domain to lead the development and delivery
of enterprise-scale data products that power AI-driven insights,
process optimization, and regulatory compliance. In this role,
you'll bridge pharmaceutical sciences with modern data engineering
to transform complex CMC, PAT, and analytical data into strategic
assets that accelerate drug development and manufacturing
excellence.

Responsibilities:
Data Product Development: Define the roadmap and deliver analysis-ready and AI-ready data products that enable AI/ML applications, PAT systems, near-time analytical testing, and process intelligence across CMC workflows.
Data Archetypes & Modern Data Management: Define pharmaceutical-specific data archetypes (process, analytical, quality, CMC submission) and create reusable data models aligned with industry standards (ISA-88, ISA-95, CDISC, eCTD).
Modern Data Management for Regulated Environments: Implement data frameworks that ensure 21 CFR Part 11, ALCOA, and data integrity compliance, while enabling scientific innovation and self-service access.
AI/ML-ready Data Products: Build training datasets for lab automation, process optimization, and predictive CQA models, and support generative AI applications for knowledge management and regulatory Q&A.
Cross-Functional Leadership: Collaborate with analytical R&D, process development, manufacturing science, quality, and regulatory affairs to standardize data products.

Deliverables include:
Scalable data integration platform that automates compilation of technical-review-ready and submission-ready data packages with demonstrable quality assurance
Unified CMC data repository supporting current process and analytical method development while enabling future AI/ML applications across R&D and manufacturing
Data flow frameworks that enable self-service access while maintaining GxP compliance and audit readiness
Comprehensive documentation, standards, and training programs that democratize data access and accelerate product development

Basic Requirements:
Master’s degree in Computer Science, Data Science, Machine Learning, AI, or related technical field
8 years of product management experience focused on data products, data platforms, or scientific data systems and a strong grasp of modern data architecture patterns (data warehouses, data lakes, real-time streaming)
Knowledge of modern data stack technologies (Microsoft Fabric, Databricks, Airflow) and cloud platforms (AWS: S3, RDS, Lambda/Glue; Azure)
Demonstrated experience designing data products that support AI/ML workflows and advanced analytics in scientific domains
Proficiency with SQL, Python, and data visualization tools
Experience with analytical instrumentation and data systems (HPLC/UPLC, spectroscopy, particle characterization, process sensors)
Knowledge of pharmaceutical manufacturing processes, including batch and continuous manufacturing, unit operations, and process control
Expertise in data modeling for time-series, spectroscopic, chromatographic, and hierarchical batch/lot data
Experience with laboratory data management systems (LIMS, ELN, SDMS, CDS) and their integration patterns

Additional Preferences:
Understanding of Design of Experiments (DoE), Quality by Design (QbD), and process validation strategies
Experience implementing data mesh architectures in scientific organizations
Knowledge of MLOps practices and model deployment in validated environments
Familiarity with regulatory submissions (eCTD, CTD) and how analytical data supports marketing applications
Experience with CI/CD pipelines (GitHub Actions, CloudFormation) for scientific applications

Lilly is dedicated to
helping individuals with disabilities to actively engage in the
workforce, ensuring equal opportunities when vying for positions.
If you require accommodation to submit a resume for a position at
Lilly, please complete the accommodation request form (
https://careers.lilly.com/us/en/workplace-accommodation ) for
further assistance. Please note this is for individuals to request
an accommodation as part of the application process and any other
correspondence will not receive a response.

Lilly is proud to be an
EEO Employer and does not discriminate on the basis of age, race,
color, religion, gender identity, sex, gender expression, sexual
orientation, genetic information, ancestry, national origin,
protected veteran status, disability, or any other legally
protected status.

Our employee resource groups (ERGs) offer strong
support networks for their members and are open to all employees.
Our current groups include: Africa, Middle East, Central Asia
Network, Black Employees at Lilly, Chinese Culture Network,
Japanese International Leadership Network (JILN), Lilly India
Network, Organization of Latinx at Lilly (OLA), PRIDE (LGBTQ
Allies), Veterans Leadership Network (VLN), Women’s Initiative for
Leading at Lilly (WILL), enAble (for people with disabilities).
Learn more about all of our groups.

Actual compensation will depend
on a candidate’s education, experience, skills, and geographic
location. The anticipated wage for this position is $126,000 -
$244,200. Full-time equivalent employees also will be eligible for a
company bonus (depending, in part, on company and individual
performance). In addition, Lilly offers a comprehensive benefit
program to eligible employees, including eligibility to participate
in a company-sponsored 401(k); pension; vacation benefits;
eligibility for medical, dental, vision and prescription drug
benefits; flexible benefits (e.g., healthcare and/or dependent day
care flexible spending accounts); life insurance and death
benefits; certain time off and leave of absence benefits; and
well-being benefits (e.g., employee assistance program, fitness
benefits, and employee clubs and activities). Lilly reserves the
right to amend, modify, or terminate its compensation and benefit
programs in its sole discretion and Lilly’s compensation practices
and guidelines will apply regarding the details of any promotion or
transfer of Lilly employees.

WeAreLilly
Keywords: Eli Lilly and Company, Fishers, Advisory, Data Scientist - CMC Data Products, IT / Software / Systems, Indianapolis, Indiana