Machine Learning Scientist I – National Center for Atmospheric Research
Application Deadline: This position will be posted until filled.
Position Term: This position is a two-year term position.
Relocation: Partial relocation assistance is available for this position.
Work Authorization: UCAR/NCAR will sponsor a work visa to fill this position.
Cover Letter: A cover letter is required for this position.
Where You Will Work:
NCAR’s Computational and Information Systems Laboratory (CISL) is a leader in supercomputing and data services necessary for the advancement of atmospheric and geospace science. CISL’s mission is to remain a leader at the forefront of ensuring that research universities, NCAR, and the larger atmospheric, oceanographic, and related research communities have access to the computational resources they need for their research. To fulfill the need for a stronger workforce at the intersection of High Performance Computing (HPC) and geoscience problems, CISL engages in education and outreach activities to inspire and attract a diverse future workforce.
What You Will Do:
The Machine Learning (ML) Scientist will conduct ML research applied to challenging problems in the Earth system sciences as part of the Analytics and Integrative Machine Learning (AIML) Group in the Technology Development Division (TDD) in the Computational and Information Systems Laboratory (CISL) at the National Center for Atmospheric Research (NCAR). The incumbent will have a proven ability to innovate by creating novel ML algorithmic approaches to problems in Earth system science or related physical science disciplines.
Relevant ML application experience may include the use of ML training and inference systems for the recognition, prediction, or tracking of important features or events in datasets or, alternatively, the replacement of model components with efficient, learned emulators through the auto-encoding of suitable physics parameterizations in Earth system models with neural networks. The scientist may also apply machine learning techniques to automate or accelerate the human analysis of hundreds of routine data products, thus amplifying the scientific capability of researchers; to integrate non-traditional data sources into Earth system prediction systems; to help optimize supercomputing workflows through ML-guided resource management; or to support the early detection and steering of numerical simulations.
The position works on ML projects as part of a team of ML scientists, data engineers and students in the AIML Group in NCAR’s Computational and Information Systems Laboratory (CISL). AIML collaborates with other laboratories across NCAR, and potentially with external physical and data scientists as well. The successful candidate will develop appropriate ML systems focused on, but not limited to, two initial projects:
Emulating complex models
Current chemistry-climate models cannot represent the complex chemistry involved in the degradation of hydrocarbons emitted by anthropogenic activities, due to the enormous number of species and reactions. ML emulation may produce less costly reduced models, providing new opportunities to investigate the impact of these emissions on human health and air quality.
AI for edge computing
Many observational instruments produce massive amounts of data that require extensive processing at the edge, that is, before the data can be used by NCAR’s HPC infrastructure. For example, the Holographic Detector for Clouds (HOLODEC) is an airborne instrument that gives an unrivaled view of 3-D distributions of droplets, providing cloud physics data of unprecedented accuracy and detail. However, analyzing the huge data volumes produced by the instrument with current techniques presents a bottleneck, limiting the instrument’s scientific utility. Image-based ML methods could accelerate the analysis process and help advance our understanding of cloud processes.
For each project, the scientist will work with the team, and in collaboration with domain scientists, to create and share the necessary training datasets, and apply, tune, evaluate and verify a variety of machine learning approaches to solving these problems.
The ML scientist’s efforts will be built on top of NCAR’s core capability in domain-focused statistical development, and will leverage its vast observational and model output datasets, CISL’s petascale supercomputing infrastructure, and cloud-based resources and environments.
The position will require the ability to work in teams and across disciplines in order to cross-fertilize ideas and build strong collaborations to tackle Earth system science challenges. This integration with and support from colleagues in the Earth system sciences will help to ensure the relevance and sustainability of the ML project scientist’s research activities.
The position provides high-level ML expertise to these projects, assists in planning each project’s human and financial resource requirements, and participates in evaluating the project’s progress and results and in adjusting its approach to better achieve objectives.
The ML Scientist may also serve, from time to time, as a consultant to internal staff and external organizations on ML topics.
Communication of Results
The scientist participates in mission-relevant academic activities including conferences, workshops and tutorials. The scientist documents research results by authoring peer-reviewed conference and journal publications, and publicizes those results in presentations at scientific meetings. As a subject matter expert, the scientist helps develop grant proposal concepts, teams and text, and may be called upon to lead proposals as a principal or co-principal investigator.
The position’s responsibilities also include an education, training and outreach component. An ML short course developed by the team has now been conducted at a number of venues. The scientist will help support and further develop this material, as well as initiatives to increase the organization’s training capacity in data-centric science, including ML. The scientist also serves as a reviewer of scientific papers and proposals, or on conference organizing committees.
What You Need:
Education and Experience:
Ph.D. in Computer Science or in a physical science discipline which uses machine learning and at least two years of post-graduate experience in the scientific field of specialization; or an equivalent combination of education and experience.
Knowledge, Skills, and Abilities:
- Excellent communication skills in presenting scientific research, and writing papers in scientific journals, technical reports and proposals.
- Ability to work and communicate with an international and multidisciplinary team.
- Advanced knowledge of two or more machine learning algorithms and the supporting mathematics.
- Knowledge of at least one deep learning framework (e.g., TensorFlow, Keras, PyTorch).
- Knowledge of at least one general machine learning framework (e.g., scikit-learn).
- Ability to use statistical methods to evaluate the performance of machine learning models.
- Experience mentoring and working with students. May supervise the work of others, including project staff.
- Familiarity with high performance computing environments.
- Solid Python programming skills, including experience with the scientific Python stack and Jupyter notebooks.
- User-level familiarity with Linux and Unix-based tools for scripting and file manipulation.
- May write funding proposals and reports to funding organizations. May serve as a principal investigator (PI) or co-principal investigator with a member of the scientific or program staff.
- May author scientific reports and publications and give presentations at scientific meetings.
- Represents the organization in providing solutions to difficult technical issues associated with specific projects.
- May assist in mentoring, supervising, training and/or directing the work of others.
- Ability to perform occasional/infrequent travel, if required.
Desired but not required (ranked in order of desirability from highest to lowest):
- Experience in computational earth system science and/or in atmospheric science is highly desirable.
- Ability to understand and modify code written in either C or C++, and in Fortran 90, is desirable.
- Familiarity with high performance computing environments would be a plus.
- Familiarity with the use of commercial cloud services would be helpful.
Notes to Applicants:
- A cover letter is required for this position. Please upload your cover letter in the Resume/CV upload box.
- An Inclusion Statement will be required for all applicants advancing to an in-person interview. If requested, this statement should address past efforts, as well as future vision and plans to advocate for and advance diversity, equity, and inclusion in the organization and/or field of work.
- A pre-employment screening is conducted in conjunction with an offer for employment. This screening may involve verifying or reviewing any of the following relevant information: restricted parties screening, employment verification, performance records of internal candidates, education verification, reference checks, verification of professional licenses, certifications, and Motor Vehicle Records. UCAR complies with the Fair Credit Reporting Act (FCRA).