Data Engineer/Data Scientist - Validate Health

Chicago, Illinois, United States · Engineering · FULL TIME


Interested in being part of a small founding team, so you can see your direct impact on improving the healthcare industry? Want to be one of the rockstars building an innovative product from the ground up?

Validate Health is an early-stage healthcare analytics company on a mission to improve access to healthcare by enabling medical organizations to operate on stable and sustainable financial models. Validate was founded by a healthcare actuary from the largest medical records vendor, who became a prominent thought leader in the transition to value-based care, and an alumnus of the U.S. Department of Health and Human Services, who served as the healthcare data liaison between government and industry. Validate is now building an analytics platform that encompasses the accumulated wisdom of its experts and clients, empowering medical organizations to manage their own financial risk while improving the clinical outcomes of their patients. We're looking for talented and driven contributors to join our team and be a part of this important moment in the healthcare industry.


This position is a versatile combination of Data Engineer and Data Scientist roles. You’ll get to play a key role in shaping the delivery of powerful data-driven products that enable sustainable value-based healthcare models.

▪ Leverage our main toolset of Python, the Anaconda stack (Jupyter Notebook, NumPy, pandas, Matplotlib/Bokeh, SciPy, scikit-learn), Postgres and several AWS services (EC2, RDS, S3, Lambda, Redshift) to do the following:

▪ Automate the processing of patient-level healthcare transactions, third-party data sources and aggregated public health data.

▪ Ingest, transform, cleanse and augment internal and external data assets. Build algorithms for fuzzy matching, de-duplication and rule-based de-identification.

▪ Implement mathematical models using Python data science tools. Generate automated simulations and forecasts of a large number of scenarios. Fully indulge your love for math, statistics and logical problem solving.

▪ Generate insightful and innovative visualizations and output reports that tell a “data story” and support your findings.

▪ Perform data modeling and performance optimization on relational databases. Write SQL for defining database objects and performing manipulations.

▪ Continuously learn by investigating and adopting new technologies. We love to experiment with cool technologies!

The right candidate brings aptitude, attitude, curiosity and grit that shine in a startup environment. If you're more into an established corporate environment, this is not the role for you.



The ideal candidate would have:

▪ Computer Science or Mathematics degree from a respected university program. Master's preferred, or Bachelor's with a history of accomplishments.

▪ Two experience levels available:

— 5+ years of full-time experience or demonstrated accomplishments in relevant subject areas.

— Recent university graduate with a Computer Science or Mathematics degree from a top-tier program, with strong aptitude for these subject areas and a demonstrated portfolio of relevant projects.

▪ Demonstrated mastery of Python and its data science ecosystem. (Knowledge of other mathematics environments such as R, SAS, SPSS or MATLAB is a plus.)

▪ Ability to design and build scalable data ingestion and processing automation from the ground up using open-source languages, libraries and frameworks.

▪ High level of expertise in SQL, relational database optimization, stored procedures and data modeling.

▪ Mastery of Linux command line, shell utilities and Git / GitHub.


Extra credit if you also have any of the following:

▪ Experience with scalable cloud data services. AWS RDS and Redshift are preferred, but Azure or GCP are good too.

▪ Understanding of methods to ingest and process non-relational data in JSON and XML formats.

▪ Ability to create and serve APIs using Python and Flask, as well as integrate with third-party REST services.

▪ Understanding of job scheduling frameworks and workflow automation.

▪ Familiarity with concepts used by ETL tools (such as SSIS, Informatica and Talend) is a plus, but we place more value on the ability to create purpose-built solutions by leveraging open-source tools.

▪ Knowledge of test-driven development practices.

▪ Proficiency in object-oriented programming.

▪ Desire to become an expert in healthcare and a passion for making an impact in this field. Experience with healthcare claims and clinical data is beneficial. Understanding of HIPAA compliance is a plus.


▪ We offer stock options, health coverage and a salary that grows with the company.

▪ You also get free lunch, coffee, drinks and beer, plus Friday happy hours.

▪ Fun, energetic and rewarding learning environment.

▪ Conveniently located in the Merchandise Mart in the Loop.


▪ Send your latest resume to [email protected]. We will only consider candidates who apply this way.

▪ Write a note explaining what makes you particularly interested in Validate and this position specifically.

▪ Feel free to include any links that you feel speak to who you are and your capabilities, such as to LinkedIn, GitHub, publications, blog or portfolio.

▪ Add “VH Data Engineer / Data Scientist via 1871” to the subject line.

We value diversity and are an equal opportunity employer.

