
Data Engineering Summer Intern

Role overview
The Data Engineering Intern, reporting to the Director of Software Engineering on the Data Platform team, is responsible for ingesting data from many client sources via data file integrations, standards-based interfaces, and API integrations. Data feeds today include patient demographics and eligibility data, as well as a full set of clinical data, such as patients’ lab or diagnostic results, diagnoses, procedures, and medications. We also use NLP tools and techniques to extract insights from unstructured data sources such as provider notes and patient imaging reports. In the future, we may explore data feeds from wearable and implantable devices, home monitors, and genetic information.
The data team is also responsible for enabling analytics by designing, maintaining, and keeping in sync an analytics data warehouse.
A day in the life…

  • Data Store - Maintain our internal data model and storage. Includes designing new tables for new concepts needed by the clinical team. Work with cross-team members (especially clinical) to keep our data model up to date.
  • Data Integration - Implement new data loads and transformations.
  • Data Analytics - Build and maintain an analytics database. Produce analytics reports/dashboards on processed data and other internal insights.
  • Data NLP - Build out tools for extracting data from unstructured text in clinical reports, such as cardiology reports.
  • Data APIs - Design and implement APIs for internal data team use to execute jobs, trigger third-party tools, and produce extracts/reports.
  • Data Reference - Build, maintain, and enhance automated data workflows that populate a centralized reference store from outside vendor feeds.
  • Testing/Monitoring/Support - All engineers are expected to help design and implement tests for the data platform, investigate and debug issues reported by clinicians, and design and implement monitoring alerts.
What we are looking for…

  • Enthusiastic about data and eager to learn about data in healthcare
  • Proficient in Python
  • Experience with both NoSQL and RDBMS databases
  • Experience interacting with cloud infrastructure (AWS preferred, but GCP or Azure is fine)
  • Familiarity with shell scripting and one form of Python notebook