What You’ll Do:
  • Design, Build & Implement
  • Design & build batch and real-time data pipelines using some of the latest technologies available on Microsoft Azure, AWS and GCP
  • Implement large-scale data platforms to meet the analytical & operational needs across various organizations
  • Build products & frameworks that can be re-used across different use cases to increase efficiency in coding and agility in implementing solutions
  • Build streaming ingestion processes to efficiently read, process, analyze & publish data for the real-time needs of applications and data science models
  • Perform analyses of large structured and unstructured data to solve multiple & complex business problems
  • Investigate and prototype different task dependency frameworks to assess & advise on the most appropriate design for a given use case
  • Understand business use cases and design engineering routines that affect the outcomes
  • Review & assess data frameworks & technology platforms with the goal of suggesting & implementing improvements to the existing frameworks & platforms
  • Understand the quality of data used in existing use cases to suggest process improvements & implement data quality routines
 
 
Who You Are:
  • An Engineer interested in working in both streaming and batch processing environments
  • A tech enthusiast excited to work with cloud-based technologies like GCP, Azure and Kubernetes
  • A doer who loves to produce meaningful analytic insights for innovative, data-intensive products
  • Always curious about analytics frameworks and well-versed in the advantages and limitations of various big data architectures and technologies
  • A technologist who loves studying software platforms with an eye towards modernizing the architecture
  • A believer in transparency & communication
 
The Tools We Use
Tools can be learned, so please don’t shy away from applying if you’re a strong engineer in general! To give you a flavor of our current tools:
 
Languages: Python, Scala, Java, SQL
Streaming: Spark Streaming, Pub/Sub, Kafka
Database software: Snowflake, Azure Data Factory, Databricks
Cloud Technologies: Azure (HDInsight, Data Factory), AWS, GCP (BigQuery, Dataproc, Dataflow, Compute Engine)