Data Pipeline Development
- Design, build, and maintain scalable data pipelines and ETL processes to support various business needs
- Develop and optimize data architecture and data models to ensure efficient data storage and retrieval
Data Management
- Ensure data integrity, accuracy, and consistency across multiple data sources
- Implement data quality checks and monitoring solutions to detect and resolve data issues
Collaboration
- Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions
- Collaborate with cross-functional teams to integrate data from various sources into the data warehouse
Performance Optimization
- Optimize database performance, including query tuning and indexing
- Implement best practices for data management, data security, and compliance
Technical Leadership
- Mentor and guide junior data engineers, providing technical direction and support
- Stay current with industry trends and technologies to continuously improve data engineering practices
Documentation and Reporting
- Create and maintain comprehensive documentation for data pipelines, processes, and data models
- Develop reports and dashboards to provide insights and support decision-making
Required Skills and Experience
- Bachelor's or master's degree in computer science, engineering, or a related field
- 4-5 years of experience in data engineering or a similar role
- Proficiency in programming languages such as Python, Java, or Scala
- Strong experience with SQL and database management systems (e.g., MySQL, PostgreSQL, Oracle)
- Hands-on experience with big data technologies (e.g., Hadoop, Spark, Kafka)
- Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and their data services
- Familiarity with ETL tools (e.g., Informatica, Talend, Apache NiFi) and data integration frameworks
- Knowledge of data warehousing concepts and technologies (e.g., Snowflake, Redshift, BigQuery)
- Strong problem-solving skills and the ability to work both independently and as part of a team
- Excellent communication skills, both written and verbal
Preferred Skills
- Experience with containerization and orchestration tools (e.g., Docker, Kubernetes)
- Understanding of machine learning and data analytics concepts
- Experience with CI/CD pipelines and version control systems (e.g., Git)
- Certification in cloud data services or big data technologies