Role Summary
A Data Engineer (DE) in a small IT team is primarily responsible for designing, building, and managing the infrastructure and tools that enable efficient processing and analysis of large data sets. The role involves designing solutions spanning data, data intelligence, data aggregation, and reporting. It requires understanding the organization's disparate data sources, whether on-premises or cloud, bespoke or COTS; analyzing their data schemas; and identifying the data elements that need to be continuously extracted into a central repository for analysis. The DE will work closely with internal teams (graphic designers, UX leads, business analysts, operations, business users) throughout an application's development life cycle. The DE must ensure applications are built with quality code, work well with design elements, integrate with on-premises/cloud systems and bespoke/COTS applications, scale well under load, and perform efficiently. Experience in health care and its terminologies would be a plus.
Responsibilities
- Develop source code for data ingestion from source systems and for data transformation using Synapse, Databricks (PySpark), and related components, incorporating business logic in the transformation layers
- Design and manage the cloud components for extraction, transformation, and loading of data from a wide range of data sources
- Build data models in Azure Data Lake; cleanse and anonymize data; make the consolidated data available to presentation layers, Power BI, and/or external applications
- Understand the ERD models of different on-premises and SaaS-based systems
- Liaise with data scientists, architects, software developers, and business analysts/users to understand how data needs to be converted, loaded, processed, and presented
- Design data-driven decision support backed by BI tools and data
- Write technical design documentation
Key Skills
- 8+ years of experience as a data engineer/analyst or in a similar role
- Experience with data ingestion, cleansing, and processing tools, preferably Azure Data Factory and related Azure services
- Ability to design, develop, and deploy programs, source code, batch scripts, and complex SQL stored procedures, functions, and triggers
- Experience in integrating data ingestion pipelines, in data engineering/modelling, and in building large-scale data lakes (Azure, AWS) with Databricks (PySpark), Synapse Analytics, Master Data Management, and ETL pipelines
- Experience with MS SQL Server, SQL Server Reporting Services (SSRS), and SQL Server Integration Services (SSIS)
- Experience in end-to-end design, coding, testing, review, and implementation using Power BI or similar service offerings
- Ability to develop spreadsheets, diagrams, and process maps to document needs
- Familiarity with graphic design, data visualization and user experience.
- Experience with Python and JavaScript is a plus
- Experience with database technologies, both RDBMS and NoSQL
- Experience in performance optimization and tuning.
- Ability to contribute effectively in a fast-paced, deadline-driven, collaborative programming environment
- Stays up to date with current trends, best practices, and new technologies
- Knowledge and practice of the SDLC process and agile methodologies