In many ways, the NORP bootcamp on Arctic Processes in CMIP6 was about making connections: connections between different processes and components of the climate system; connections between different types of data (in-situ observations, climate models, satellite remote sensing, ice- and ship-based measurements); connections between people; and connections between computers.
We called this event a bootcamp to emphasise its short but intense nature, but you could also call it a summer school or, perhaps more appropriately, a hackathon, as the idea was not just to provide additional education but also to extract additional value from the CMIP6 global climate modelling dataset for the Arctic. The idea arose even before the CMIP6 archive began filling up, as we became more aware of just what a challenge the Coupled Model Intercomparison Project (CMIP) is starting to pose to climate researchers. The petabytes of data produced by around 40 international modelling centres, completing many and varied climate experiments along with the core DECK experiments, represent data gold. However, extracting that gold to improve our understanding of the Earth system requires a lot of work and careful thought. Here the application of big data tools can really help. The bootcamp programme therefore also aimed to help users learn about big data processing techniques that can hopefully be applied in multiple contexts. We chose to use pangeo, which is perhaps best described as a community developing open tools and infrastructure for big data geoscience research, gathered together in a single platform.
The topic of Arctic processes in CMIP6 was chosen by the Northern Oceans Regional Panel (NORP), whose members all participated in different aspects of organising the bootcamp. The format was very successful, however, and could easily be applied to other themes and/or geographic locations.
The final motivation for organising the bootcamp became more important over the course of the planning as the Covid-19 pandemic raged. We had to postpone the bootcamp twice due to travel and meeting restrictions, and also moved the location from the originally planned base on Helgoland to Søminestation, a research station in Denmark. It became clear during that time that early career researchers (ECRs) were particularly affected, at a key stage in their careers when networking and collaboration are important. We therefore felt that it was important for ECRs to meet not only each other but also a range of different senior scientist mentors and lecturers. Although other similar events were held as online-only activities, we felt the in-person element would be crucial for building enduring and productive collaborations among international participants. To this end, we arranged a social programme with activities like a pub quiz, morning yoga and a “save the glaciers” escape-room type of game to help students and mentors work together informally.
The efficient data processing aspect of the bootcamp was greatly helped by the participation of Tina Odaka (in person) and Anne Fouilloux (online) of IFREMER. Tina's presence at the bootcamp was essential, and a large part of the success of the bootcamp projects is down to her help in tutoring participants through the tools, uploading the missing datasets needed and helping out with ad hoc requests. In addition, Anne and Tina provided free access to the European Open Science Cloud (EOSC), via their Foss4G project, which meant that the processing and big data aspects of the bootcamp could all be done using cloud computing. This made the processing of data considerably more efficient and has hopefully helped to raise the profile of open science solutions like pangeo.io and the EOSC, as well as helping ECRs and mentors alike to work more efficiently with large datasets in the future.
Given the pandemic background, all participants were asked to take a Covid test before travel and every other day during the bootcamp (fortunately we had no positive results), and we asked (but did not check) that all be vaccinated before attending. Regular ventilation of the work room and encouraging walk-and-talk discussions outside in the forest were also important precautions. We provided masks and hand disinfectant for those who needed them, but did not insist on their use while working.
The daily programme varied a bit but included a slot for student presentations on their own research field, one or two lectures from senior scientists (some delivered online, some by mentors present at the bootcamp) and a large block of free working time on the bootcamp technical projects. In the first week we had technical lectures and practicals on using pangeo and different techniques for data processing, as well as some lectures giving a grounding in different elements of the Arctic climate system, including the ocean, atmosphere, sea ice, ice sheets and glaciers, and in climate modelling, including the use of large ensembles. Other topics included Arctic links to mid-latitude climate and the use of remote sensing data.
The main aim of the bootcamp was to work on technical projects in small groups, with the assistance of senior scientists who acted as mentors to the groups. Given family and travel obligations, most mentors attended in person for only part of the bootcamp, with some also being partly or wholly online. This meant that some groups had two different mentors at different times, or had partly online and partly in-person mentoring.
Given the short but intense nature of the event we found that most groups managed a lot of very detailed technical work, but follow-up afterwards was necessary to turn this into publications and/or presentations. The role of the mentors here has been important to keep the momentum going, as after a sprint of this nature, many of the bootcamp participants were quite tired and then returned to their own research projects. The experience and organisation of a senior scientist is helpful here in keeping the group together and focused on the end goal, though it should be noted not all groups have needed this.
The group projects are listed below. We expect at least three of them to produce publications: they have all been presented at international conferences (EGU, Mosaic science and AGU), and the ECRs involved report that they are well on the way to submitting drafts. Other projects have been subsumed into PhD work already planned by ECRs, and in this sense the network effect of mixing ECRs who have different specialisms has been very effective in widening already existing research.
The ECRs at the bootcamp (listed in the appendix) were selected by NORP members from the applicants to cover a range of different disciplines, career stages and countries. They included MSc students, PhD students and postdoctoral scientists covering subjects ranging from global climate modelling, to in-situ field observations, to satellite remote sensing. In this way, we found that different participants had different skills and could support each other in learning and applying new knowledge. We received rather few applications from the Global South, but we prioritised these (as long as the applicant was studying something relevant) and also strove for a good gender balance. Thanks to funding from CLIVAR, administered by the WMO, we were able to offer travel funds to all students who requested them. This was the largest expense in the project, but we considered it important to mix students from many places who might not otherwise have the opportunity to meet. All of our mentors came from European institutions. This was largely to reduce the travel budget and carbon footprint of the bootcamp as much as possible, but we were able to host online lectures from scientists in North America.
We encourage other groups to consider this format: it is a powerful and productive way to kick-start new collaborations and to get deep, focused work done. At the same time, we have some lessons to share from the experience:
Firstly, the pangeo platform is very powerful and a useful place to start when dealing with multiple models, but it was also new to most participants, including the senior scientists. It took quite a lot of time to get ECRs up and running. Given this, it would be helpful to have some of the introductory material delivered beforehand, and future bootcamps need to assess carefully what kind of cloud resources they need and where to source them. The EOSC is a fantastic resource for this and pangeo is already available there, but applications to use it will need to be submitted well in advance.
Secondly, the most effective projects were those where the senior scientist mentor had prepared well in advance, working out which datasets would be helpful, and could outline the project effectively to the ECRs at the start, or even in advance on the bootcamp Slack workspace. In some cases, the ECRs themselves came up with specific project ideas beforehand and worked on these with others. These were also generally very successful projects, and the bootcamp thus acted as an incubator for ideas that might not otherwise have been explored. Mentors all reported enjoying the experience, but the commitment required to see the projects through to presentation or publication should not be underestimated. We also found it helpful to assign two mentors to groups where a mentor could only attend part of the bootcamp.
The use of online tools such as Slack and GitHub to organise, communicate and save or share code was extremely helpful at all stages. We used these to organise the early stages and to communicate with the ECRs, and the project channels within the workspace continue to be used as projects move towards publication. Given that Slack is a commercial product and messages on free accounts disappear over time, in future bootcamps we may consider using tools like self-hosted Mattermost instead.
Finally, the work involved in organising an event of this nature should not be underestimated. A helpful administration will greatly reduce the difficulties of receiving and paying out funds, and a committee to plan the event and review applications is essential. A local on-the-ground organisation is also helpful for ironing out annoying details.
We thank all the mentors very much for their engagement, the ECR participants for their enthusiasm and hard work, the funders for their trust in disbursing the funds that made it possible and the administration at DMI for helping to disentangle the different invoices and the on-the-ground organisation. The connections we made at this bootcamp will continue to bear fruit for a long time.
- Open science tools like pangeo are very suitable for processing big datasets using cloud computing solutions like the European Open Science Cloud. Arctic science and climate science in general will benefit from more cross-collaboration with the open science/big data community
- CMIP6 models have a large number of biases that make it challenging to be confident about some impacts of climate change in the Arctic. We highlight some necessary improvements aimed at CMIP7
- A brief, intense, focused work sprint is a useful method for solving difficult problems in climate science. The format allowed early career researchers to build networks, gain experience presenting their work and develop international collaborations with each other and with more senior mentors
Date and Location:
11. - 21. October | Søminestationen, Denmark
IASC Working Groups / Committees funding the Project:
- Atmosphere WG
Year funded by IASC