Job Title: Postdoctoral Fellow - Natural Language Processing (NLP) and Artificial Intelligence (AI)
50% onsite / 50% remote
Responsibilities:
- Utilizes supervised or unsupervised methods, learning from large volumes of unlabeled data, to derive insights from unstructured text.
- Ensures the code life cycle is managed through version control and associated repositories.
- Develops high-quality analytical and statistical models, insights, patterns, and visualizations that can be used to improve decision making in manufacturing operations.
- Responsible for documentation of all technical work both within and outside of formal document management systems.
- Independently develops code and analytical models to automate data transformation and analysis.
- Performs data engineering, preprocessing, exploratory data analysis, and model development by interacting with a variety of databases.
- Responsible for ingestion, integration and delivery of data across multiple platforms.
- Maintains and upholds data integrity and clean-data principles.
- Communicates with team members regularly to provide updates and collaborate on deliverables.
- Displays a high level of teamwork and collaboration both within and across functions.
Requirements:
- PhD in a quantitative area of study (Computer Science preferred) with knowledge of deep learning methods for NLP.
- Strong background and demonstrable experience in Natural Language Processing and Computational Linguistics are required.
- Proficient in writing and developing analytical and machine learning models using Python libraries including pandas, NumPy, scikit-learn, and TensorFlow. Experience developing and implementing MLOps pipelines.
- Experience building analytical and statistical models to answer key business questions.
- Experience using Git via the command line.
- Strong understanding of core statistical concepts to solve real-world problems.
- Intermediate to advanced proficiency in SQL (3+ years of post-academia experience as an independent contributor designing and delivering data solutions).
- Experience interacting with various data warehouses and large-scale, complex datasets using ETL and BI tools and platforms.
- Self-motivated to identify and propose methodologies that will drive increased efficiency.
- Demonstrated expert knowledge of machine learning and rule-based systems as applied to computational linguistics and natural language processing, as well as experience developing and executing annotation tasks with teams of experts.
- Proficiency in mathematics, with the ability to translate complex mathematical algorithms into usable computational methods.
- Experience with data mining and analysis techniques across disparate data sources.
- Experience working in Linux/UNIX environments.
- Experience interacting with PostgreSQL, Oracle, Cloudera Impala, Okera, or similar databases.
- Experience with JupyterLab, Anaconda, and RStudio.
- Intermediate proficiency with Python.
- Experience developing visualizations using a variety of methods (plotly, matplotlib, seaborn).
- Experience working within Domino Data Lab projects.
- Technical knowledge of performance tuning and query optimization across large data sets.
- Experience with data cataloguing and enablement through APIs.
- Experience with a variety of programming and markup languages (C++, Java, HTML/CSS).
- Exposure to bioprocess engineering/cell therapy data.
- Knowledge of GxP requirements (preferably related to data and code management).
Preferred:
- Dashboard development experience (Tableau, Spotfire, Dash).
- Experience working with the pharmaceutical industry.