The Massachusetts General Hospital (MGH) Neurology Department, an affiliate of Harvard Medical School and a world leader in brain research, is seeking a big data software engineer to join the Clinical Data Animation Center (CDAC). CDAC’s mission is to bring the power of medical Big Data to MGH Neurology. Specifically, CDAC fosters research and clinical trials by providing critical infrastructure, tools, and expertise for working with medical Big Data; supports clinical process improvements that can benefit from Big Data analytics; and enables highly granular retrospective and prospective studies using MGH’s data archives and continual capture of new data. Our team is building the data resources, models, algorithms, and pipelines needed to harness the power of medical Big Data in pursuit of CDAC’s mission. We are looking for a big data software engineer who can thrive in pursuit of this mission alongside experts in neurology, neuroscience, statistics, time series analysis, machine learning, and computer science.
The role requires technical, interpersonal and problem-solving abilities. Specifically, you will design and develop the pipelines needed to collect, curate, cross-reference and archive data routinely acquired by biomedical devices, including continuous EEG, ECG, and continuous physiologic data from bedside monitors, as well as all available clinical data obtained from bedside data management systems and relevant hospital archives. Examples of the data collected include HL7 messages from bedside monitors (ORU_R01), ADT (Admission, Discharge and Transfer) data, encounter IDs, patient demographics, text notes, procedure and diagnostic codes, trends of vital signs, and other physiologic measurements.
We are seeking someone with eagerness to learn and apply technology, versatility, and breadth in all facets of software development: business analysis, requirements gathering, functional and technical specification, design, development, implementation, testing, deployment, and support of new applications. Ideal candidates must be able to demonstrate an interest in medical informatics, health sciences, healthcare interoperability, and research, and in working in an academic environment with customers who are world-renowned leaders in their scientific fields.
PRINCIPAL DUTIES AND RESPONSIBILITIES:
This position involves a wide range of responsibilities, including:
- Developing and deploying software and apps that interact with clinical information systems and mobile devices, and with existing databases and software systems.
- Developing software for real-time interaction with clinical devices, including EEG recording devices, bedside ICU monitors, and mobile health monitoring devices.
- Designing, creating, building, integrating, maintaining and optimizing multiple ETL data pipelines.
- Aggregating and transforming raw data from a variety of sources (e.g., Microsoft SQL, Apache Hive, Apache HBase, Enterprise Data Warehouse, bedside monitors (HL7), EEG recordings (waveforms), web services, and others) to fulfill functional and non-functional requirements.
- Participating actively in all facets of software development: business analysis, requirements gathering, functional and technical specification, infrastructure definition, data architecture design, development, implementation, testing, deployment, and support of new applications.
- Mentoring teammates with less engineering experience and sharing knowledge of computer science best practices with the team.
- Creating and maintaining related documentation on Confluence, including software specifications, pipeline diagrams, dataflow diagrams, integration schemas, interoperability relationships, etc.
- Refining software development processes and best practices.
Demonstrated proficiency in Python, Apache Spark, shell scripting (Bash) and SQL.
Proven ability to meet deadlines and work cooperatively in a small collaborative team.
Demonstrated capability as a highly organized, creative problem-solver.
Demonstrated knowledge of working in an Agile/Scrum environment and familiarity with Jira and Confluence tools.
Ability to adapt to rapidly changing and high-demand environments.
Strong verbal and written communication skills, including the ability to write clear technical documentation.
Comfort with probability, linear algebra, and data analysis, and experience processing or analyzing biological data.
Minimum of a Bachelor’s degree in Computer Science or a related field; Master’s degree strongly preferred.
Experience with Python and at least one other programming language or framework (R, MATLAB, C++, Java, Scala, Spark).
Experience designing, coding, testing and debugging multiple ETL integration interfaces of varying size and complexity.
Experience writing complex SQL statements to extract data from multiple databases.
Experience working with large volumes of structured, semi-structured & unstructured data.
Hands-on experience with Big Data frameworks/Hadoop-based technologies (Spark, Kafka, Hive, HBase, Sqoop, Ranger, HDFS) is required.
Experience in the healthcare industry with a deep understanding of a variety of medical coding systems (e.g., ICD-9/10, CPT, SNOMED, LOINC).
Experience with Hadoop-based technologies for distributed near-real-time processing.
Experience with healthcare interoperability standards, including HL7 messaging, DICOM or FHIR.
Experience with Azure, GCP, AWS or other cloud providers.
Will work within a team of physicians, clinical research coordinators, and physician-scientist neurologists. Work will be performed in various adult ICUs at MGH, in the MGH Epilepsy Monitoring Unit (EMU), and in a computational EEG laboratory space on the MGH main campus.
Massachusetts General Hospital is an Equal Opportunity Employer. By embracing diverse skills, perspectives and ideas, we choose to lead. Applications from protected veterans and individuals with disabilities are strongly encouraged.