Senior Data Engineer

At Tagup, we use machine learning to make the machines that power the world safer, more reliable, and more efficient. As a rapidly scaling AI/ML technology company, we are looking to expand our team.

The Senior Data Engineer will be a core technical contributor to Tagup’s data engineering team, with deep expertise in creating and manipulating large, complex datasets that feed central data warehouses for Tagup's data science, product, and engineering teams. The Senior Data Engineer will be responsible for standing up and maintaining data pipelines, building computed tables and database structures, identifying data integrity issues, and managing data at Tagup. The Senior Data Engineer will also work closely with Tagup project managers, data scientists, and ML engineers to help prepare data for models and dashboards.

Responsibilities

  • Build highly reliable computed tables that combine and transform data from multiple sources, including Tagup sensor data, customer metadata, financial data, and unstructured data such as video and audio
  • Use Python to access, manipulate, and join external datasets to internal data (e.g., via REST APIs, PySpark, Spark SQL)
  • Ensure very large databases and compute clusters operate optimally and enable data science and software engineering teams
  • Implement and maintain database structures and governance
  • Develop / maintain data management at Tagup (including scalable systems to document metadata)
  • Work closely with stakeholders across the company, including product engineers, data scientists, customer support, finance, and more, to build data pipelines that solve business needs
  • Assist our data science team by building robust data annotation, training, and inference pipelines


Desired Skills

  • BA / MS degree in Computer Science, Statistics, or related discipline
  • Experience in data engineering focused on ML / data science and ML operations
  • Experience with standing up ETL pipelines to handle massive volumes of data
  • Experience processing and manipulating very large datasets, preferably in Python (e.g., with PySpark)
  • Strong proficiency in SQL, Python, and working with REST APIs
  • Knowledge of software engineering fundamentals; high level of comfort reading and understanding full-stack / backend development code (e.g., our Python code base)
  • Familiarity with managing code via GitHub or another version control tool
  • 4+ years of experience as a data engineer or data-focused software engineer


Bonus Points

  • Some experience with data visualization, preferably in Tableau
  • Experience with distributed machine learning
  • Some experience with time-series data

As a fast-growing technology company, we offer all members of the team part-ownership through an Employee Stock Option Plan. We also offer health insurance benefits and 401(k) fund options, and we encourage a team-oriented work environment with regular company outings.

Tagup is an equal opportunity employer, and individuals seeking employment with us are considered without regard to race, color, religion, national origin, age, sex, marital status, physical or mental disability, veteran status, gender identity, sexual orientation, or any other characteristic protected by law.

To all recruitment agencies: Tagup does not accept agency resumes. Please do not forward resumes to our jobs alias, Tagup employees, or any other company location. Tagup is not responsible for any fees related to unsolicited resumes.