We are seeking an intermediate software developer with solid experience in, and a great passion for, developing big data systems that will change the very landscape of medical science.
You, our ideal candidate, are looking for a small and dynamic multidisciplinary team of researchers and engineers who work together in an agile fashion. You are prepared to bring your drive, your experience, and your passion to contribute at all levels to the entire team. We, Indoc Research, are that team. We build and manage complex health research infrastructure for collaborators and clients. We bring together prominent research organizations across the country and internationally. Together we have created large-scale informatics platforms involving diverse and complex data modalities (e.g. imaging, genomics, clinical assessments) across multiple disease areas (e.g. neurodegeneration, depression, cancer).
You will have the opportunity to stretch and develop your data-handling design and implementation skills to a whole new level. You will imagine, help design, and implement whole new ways to bring multidimensional data from an endless variety of sources into a collection of platforms. How do we efficiently manage, link, integrate, and federate not only gigabytes but terabytes of information? How do we query and navigate highly structured and highly unstructured data? How can we shape international collaborative efforts towards consistent lexicons and adapt to emerging ontologies? And how do we do it all efficiently, reliably, and accurately? These are the questions that you will help us answer.
With strong skills in data handling and representation, you will be responsible for helping create futuristic yet realistic systems for use by surgeons in the midst of procedures, researchers accessing remote data across the globe, doctors trying to understand the genetics of their patients even at the bedside, and patients and the public seeking insight into their own maladies and conditions. Your creativity and innovation, combined with the multidisciplinary skills of the rest of our team, will help deal with security and privacy even while enabling fusion of high-dimensional data across multiple medical modalities as diverse as MR imaging, molecular science, and psychological assessments. Your work will help address critical gaps and fulfill currently unmet and urgent needs in both clinical and research communities, handling data from distributed settings such as critical care units, clinical laboratories, and hospital imaging facilities.
As a software developer on these projects, you will have a critical role at all levels of design and implementation, architecture and testing, deployment and support. And you will be building the future of medical science.
- 2+ years of programming experience in Python. Knowing Java would be an asset.
- Good work experience with Elasticsearch and Spark SQL.
- Familiarity with modern data management systems, including RDBMSs, Redis, and/or MongoDB.
- Experience with a version control system such as Git or SVN.
- Solid experience with application deployment in a UNIX/Linux environment and Docker ecosystems.
- Strong personal research capabilities and the ability to learn new technologies/products quickly.
- Experience with high-throughput data ETL pipelines using Apache Kafka, Logstash, or other message queuing systems
- Experience with Big Data ecosystems (Spark framework, Thrift, Hadoop, HBase)
- Experience in lexical and ontological technologies (such as the semantic web)
- Experience in RDF, XML, SPARQL and related technologies
- Strong written and oral communication skills
- Track record of initiative and self-organization with strong time management skills
- Willingness and ability to work on multiple projects at the same time
- Demonstrated ability to work within a collaborative team across multiple disciplines
- Bachelor’s degree or equivalent in Computer Science, Software Engineering, or a related field