HealthPartners is hiring a Data Engineer. Our mission is to provide simple and affordable healthcare. HealthPartners teams use data to improve patient and member experience, improve health, and reduce the per capita cost of health care. The Personalization team uses complex data to identify specific populations and produce highly personalized marketing campaigns across various marketing channels.

This Data Engineer will primarily focus on using SQL and Python to query Databricks tables to create audiences and segments for outreach. Secondarily, they are responsible for building, managing, and optimizing the data pipelines that facilitate data movement in service of these goals, implementing and testing methods (or building systems) that improve data reliability and quality. They will work in collaborative scrum teams with other engineers, web developers, and campaign owners, and may share work efforts in order to achieve campaign goals. They champion and embrace leading practices in the field and develop processes to effectively store, manage, and deliver data. As part of their role, data engineers are responsible for reducing manual data work and improving productivity; they employ and test innovative tools, techniques, and architectures to detect patterns and automate common or repetitive data preparation and integration tasks.

ACCOUNTABILITIES:
- All team members must champion and model our values of partnership, curiosity, compassion, integrity, and excellence, and must contribute to a culture of continuous learning
- Work with stakeholders, data scientists and analysts to frame problems, clean and integrate data, and determine the best way to provision that data on demand
- Collaborate with other developers to design technology solutions that achieve measurable results at scale
- Help design and develop scalable, efficient data pipeline processes to handle data ingestion, cleansing, transformation, integration, and validation required to provide access to prepared data sets for business partners and web developers
- Utilize development best practices including technical design reviews, implementing test plans, monitoring/alerting, peer code reviews, and documentation
- Collaborate with cross functional team to resolve data quality and operational issues and ensure timely delivery of products
- Incorporate core data management competencies including data governance, data security, and data quality
- Participate in requirements gathering sessions with business and technical staff to distill technical requirements from business requests
- Help identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability
- Perform other duties as required to meet team sprint goals
REQUIRED QUALIFICATIONS:
- Bachelor's degree in computer science, data or social science, operations research, statistics, applied mathematics, econometrics, or a related quantitative field; equivalent education and experience in areas such as economics, engineering, or physics is acceptable
- 2+ years' experience in a hands-on data engineering role
- 2+ years' experience with SQL and Python for data transformation and analysis
- Experience working with Databricks, Spark, or similar platforms such as Snowflake
- Hands-on experience working with large datasets (millions of records, enterprise data)
- Experience transforming data structures (multiple tables, different data sources, relational or semi-structured data)
- Demonstrated understanding of data formats such as Parquet, Avro, Delta, CSV, and JSON
- Demonstrated understanding of data processing techniques such as full-batch processing, time-based partitioning, and distributed and real-time processing
- Strong data profiling and analytic skills; ability to discover and highlight unique patterns and trends within data to identify and solve complex problems
- Must be motivated, self-driven, curious, and creative
- Must be a skilled communicator with a demonstrated ability to work with end users and partners
- Demonstrated ability to support and complement the work of a diverse development and/or operations team
- Must be flexible and adaptable in fast-changing, deadline-driven environments
PREFERRED QUALIFICATIONS:
- Experience with Airflow and GitHub
- Healthcare data experience, preferably with Epic Clarity or claims data
- Knowledge of health care operations
- Exposure to Agile/Scrum
- Ability to work in a hybrid cloud environment consisting of on-premise and public cloud infrastructure. An ideal candidate will have experience with one or more of the following skill sets:
- Experience with relational databases such as Oracle and SQL Server
- Experience optimizing and tuning SQL/Oracle queries, stored procedures, and triggers
- Experience with Python (NumPy, pandas, Matplotlib, etc.) and Jupyter notebooks for exploratory data analysis, machine learning, and process automation
- Experience in areas of CI/CD, continuous testing, and site reliability engineering
- Familiarity with Microsoft Azure services such as Azure Data Factory, Synapse, Purview, Databricks/Spark, Power BI, and PowerApps
- Familiarity with building Power BI data models using advanced Power Query and DAX
- Interest and desire to contribute to emerging practices around DataOps (CI/CD, IaC, configuration management, etc.)