Senior Software Engineer - Data Engineering

Date: Apr 5, 2019

Location: Dublin, L, IE, D02 R590

Company: Houghton Mifflin Harcourt

Job Requisition ID: 14879

Additional Locations: 

  • We're based in Trinity Central, 152-160 Pearse St., Dublin 2.
  • Our city centre offices are near the back entrance to Pearse Street DART station and beside Tesco Express on Pearse Street. We’re close to all major transport links, coffee shops, restaurants and gyms, and minutes from Stephen’s Green and the city centre.

Who we are:

At HMH, our learning platforms serve millions of learners and educators. We have interesting problems to solve, including how students interact with content from our learning platforms, how it impacts learning outcomes, and how to provide data-driven guidance to maximize learners’ potential. How we gather, store, process and analyse data is key to effective decision-making for our business and our customers.

About the team you’re joining:

  • You will work closely with our data science, learning science, development, DevOps and product teams to develop robust data pipelines and analytics capabilities for HMH and our customers.
  • You will be joining a technology-forward company.
  • We’ve invested heavily in a wide range of both open-source and AWS data tools to solve complex big data questions (e.g. building a data lake using S3, designing a new Enterprise Data Warehouse using Redshift, processing events using Kafka, and automating batch and streaming pipelines using Spark).

Not only that, but we’ve hired some of the best Data Engineers and developers for you to work alongside. Our teams are driven by a passion for technology, a love for high-quality code, and our mission to transform learners’ lives. Are you?

Primary responsibilities:

Here are the kinds of problems you will be helping us solve:

  • Work with Spark to perform batch and real-time event processing, and gather data from other parts of the business to populate our data lake.
  • Design and implement data pipeline solutions.
  • Use Kafka to stream large data sets (which support our millions of users).
  • Work with our Data Science and Development teams to implement our data lake using AWS (Glue ETL).
  • Improve our APIs using Spring Boot.
  • Work with our DevOps teams to create the right environments (using Mesos, Docker and Terraform).
  • Troubleshoot issues across different technologies such as Spark, Kafka and AWS apps.

Knowledge & Experience:

Here’s what you’ll need to be successful in this role:

  • A bachelor's degree in computer science or a related field is preferred.
  • Experience working with Apache Spark on batch and streaming jobs.
  • Solid experience producing and consuming large datasets on Kafka.
  • Experience working with Java and Spring Boot (API creation).
  • Experience working with big data on AWS (EMR, S3).
  • Knowledge of big data architecture patterns.
  • Experience with Docker, containerisation and Apache Mesos (nice to have but not essential).


Houghton Mifflin Harcourt (NASDAQ:HMHC) is a global learning company dedicated to changing people's lives by fostering passionate, curious learners. As a leading provider of pre-K-12 education content, services, and cutting-edge technology solutions across a variety of media, HMH enables learning in a changing landscape. HMH is uniquely positioned to create engaging and effective educational content and experiences from early childhood to beyond the classroom. Follow HMH on Twitter, Facebook and YouTube.

Job Segment: Social Media, Marketing, Publishing, Education