MS degree required in EEE, ECEC, CSI, or IT.
Data Engineer strong in Hadoop, Python, JS, Kafka, Scala, Data Lake, etc.

  • Design and develop Artificial Intelligence and Machine Learning applications in Big Data ecosystems using Spark.
  • Develop and maintain the backend databases and ETL flows for a Verizon enterprise portal.
  • Handle development (coding the application), code review and maintenance, unit and integration testing, and application build and deployment.
  • Develop data pipelines to integrate UI frameworks (Node.js) with messaging systems (Kafka); see the consumer sketch after this list.
  • Analyze existing databases (Postgres and MySQL) to design effective systems.
  • Develop stored procedures in Oracle and MySQL and optimize queries to improve frontend performance.
  • Participate in client discussions to analyze user requirements for new projects and assess their feasibility with existing technologies.
  • Develop Spark applications for processing batch and streaming datasets in Python, Scala, and Java (a PySpark batch sketch follows this list).
  • Create a synchronization mechanism from a local server to HDFS using LFTP and Python; see the sync sketch after this list.
  • Set up and maintain a multi-node Kafka cluster for the production environment.
  • Analyze information to recommend and plan the installation of new systems or modifications to existing systems.
  • Set up Hadoop and Spark clusters from scratch for the preproduction environment.
  • Write SQL queries and perform back-end testing for data validation, verifying data integrity during migration from back end to front end.
  • Use the Hive data warehouse tool to analyze data in HDFS and develop Hive queries (see the external-table sketch after this list).
  • Implement custom Spark UDFs for comprehensive data analysis in both Scala and Python; a PySpark UDF sketch appears after this list.
  • Provide POCs on core Spark and Spark SQL that replicate the existing ETL logic.
  • Develop numerous ETL flows for data movement using both Python and PySpark.
  • Take on tasks across the various phases of software development, including preparing software specifications and business requirements documents.
  • Create dimensional models for the backend tables in MySQL.
  • Work extensively in data analysis and wireless systems, developing predictive and forecasting algorithms.
  • Create and populate many Hive external tables with structured data for the data scientists to train their models.
  • Collect real-time networking data from the provided REST APIs and push it to the database using Python; see the ingestion sketch after this list.
  • Wrap the anomaly-detection function developed by the data scientists as a Spark UDF and create an RDBMS table for the UI dashboards.
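
The sketches below illustrate several of the duties above in code. They are minimal, hedged examples: topic names, hosts, paths, schemas, and thresholds are placeholders, not details from the actual role. First, the Node.js-to-Kafka pipeline bullet, shown from the consuming side with the kafka-python client and a hypothetical "ui-events" topic:

    import json
    from kafka import KafkaConsumer

    # Hypothetical topic and local broker; the production settings differ.
    consumer = KafkaConsumer(
        "ui-events",
        bootstrap_servers=["localhost:9092"],
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
        auto_offset_reset="earliest",
    )

    for message in consumer:
        event = message.value
        # Hand each UI event off to downstream ETL/enrichment.
        print(event.get("type"), event.get("payload"))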
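
Next, a minimal PySpark batch job for the Spark bullet above; the input path, column names, and aggregation are illustrative assumptions:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("batch-etl-sketch").getOrCreate()

    # Hypothetical input path and columns.
    events = spark.read.json("hdfs:///data/events/")
    daily = (
        events
        .withColumn("day", F.to_date("event_time"))
        .groupBy("day", "device_type")
        .agg(F.count("*").alias("event_count"))
    )
    daily.write.mode("overwrite").parquet("hdfs:///data/daily_counts/")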
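
The local-to-HDFS sync could be as simple as Python driving lftp's mirror command and the HDFS CLI; the hosts and directories here are placeholders:

    import subprocess

    LOCAL_DIR = "/data/staging"            # hypothetical landing directory
    REMOTE = "sftp://user@source-host"     # hypothetical source server
    HDFS_DIR = "/warehouse/raw/staging"    # hypothetical HDFS target

    # Mirror the remote directory to the local staging area.
    subprocess.run(
        ["lftp", "-e", f"mirror --only-newer /outbound {LOCAL_DIR}; quit", REMOTE],
        check=True,
    )

    # Push the staged files into HDFS.
    subprocess.run(["hdfs", "dfs", "-put", "-f", LOCAL_DIR, HDFS_DIR], check=True)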
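
For the Hive bullets, a sketch of registering an external table over HDFS data via Spark's Hive support; the database, table, and location are assumed names:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("hive-sketch")
        .enableHiveSupport()   # requires a Hive-enabled Spark deployment
        .getOrCreate()
    )

    # Assumed database, table, and location.
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS analytics.kpi_daily (
            day DATE,
            device_type STRING,
            event_count BIGINT
        )
        STORED AS PARQUET
        LOCATION 'hdfs:///data/daily_counts/'
    """)
    spark.sql(
        "SELECT day, SUM(event_count) FROM analytics.kpi_daily GROUP BY day"
    ).show()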
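
The Spark UDF bullets (including the anomaly-detection one) might look like the following in PySpark; the scoring rule is a stand-in for the data scientists' actual function:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import DoubleType

    spark = SparkSession.builder.appName("udf-sketch").getOrCreate()

    @udf(returnType=DoubleType())
    def anomaly_score(latency_ms):
        # Placeholder rule standing in for the data scientists' model.
        if latency_ms is None:
            return 0.0
        return 1.0 if latency_ms > 500 else 0.0

    metrics = spark.createDataFrame([(120.0,), (860.0,)], ["latency_ms"])
    metrics.withColumn("score", anomaly_score("latency_ms")).show()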
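
Finally, a sketch of the REST-to-database ingestion; the endpoint and schema are invented, and sqlite3 stands in for the production MySQL/Postgres connection:

    import sqlite3   # stand-in for the production MySQL/Postgres driver
    import requests

    # Invented endpoint; the real APIs were provided internally.
    resp = requests.get("https://api.example.com/network/metrics", timeout=30)
    resp.raise_for_status()
    rows = [(m["node_id"], m["throughput"]) for m in resp.json()["metrics"]]

    conn = sqlite3.connect("metrics.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics (node_id TEXT, throughput REAL)"
    )
    conn.executemany("INSERT INTO metrics VALUES (?, ?)", rows)
    conn.commit()
    conn.close()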