Community Partner Highlight - Centro Interdisciplinario en Ciencia de Datos y Aprendizaje Automático (CICADA)

This blog post is part of a series highlighting the Catalyst Project’s Community Partners, who are using the Catalyst Project cloud infrastructure to further various projects in the biosciences. Community Partners also play a vital role in shaping our governance model to help us sustain, scale and maximise impact in Latin America, Africa, and under-served communities around the world.

In this blog post, María Inés Fariello Rico shares how partnering with the Catalyst Project is impacting the Centro Interdisciplinario en Ciencia de Datos y Aprendizaje Automático (CICADA) in Uruguay.

Could you introduce yourself to our readers? Tell us about your organisation/institute/project and the research the members of your community are working on right now.

I am María Inés Fariello, a.k.a. Maine, and I am part of CICADA, an interdisciplinary center researching data science and machine learning. We have several projects that involve bioscience research in areas such as ecology, bioimaging, neuroscience, and genomics. We are using the services provided by the Catalyst Project to analyse Uruguayan population data in order to understand patterns of migration, how much of the native footprint remains, and what can be said about the people who lived in the Uruguayan territory before the arrival of Europeans. Five of us, all female researchers, are using the hub, while the entire CICADA team has around 25 members.

“The Catalyst Project allows us to work in a collaborative way that is easy to use. The trainings are attractive, as they are respectful of the people, no previous knowledge is assumed, and the instructors are welcoming. So, it invites people to participate in the courses. As the knowledge of the courses is so clear and the provided materials too, it is easy to share the knowledge with other people.”

What are you specifically using the cloud infrastructure for? What kinds of data are stored there? What software packages does your community use?

We carried out a national genome sampling of around 850 individuals (Uruguay has 3.5 million people). To analyse the genomes, we pre-process the samples locally and then, once we have the genomes in Variant Call Format (VCF), we put them in the hub, where we mostly use Python.

Working with the infrastructure provided by the Catalyst Project gives us computing power and makes it easier to collaborate, as it is more comfortable than working on clusters. We have not used the Catalyst infrastructure for educational purposes yet, but we are planning to use it for a bioinformatics course (human genomics) that is scheduled for October 2025.

Can you tell us about your experience with the Catalyst Project so far? E.g., have you participated in any training options through Catalyst? What have you enjoyed most so far?

My colleague Graciana Castro and I participated in the 2i2c Hub Champion Training and it was very helpful. In fact, we began to use the hub immediately after that training. All the people involved in the Catalyst Project have been kind and helpful. I am looking forward to participating in The Carpentries Instructor Training delivered by MetaDocencia soon.

A photo of a woman pointing at a laptop showing the Catalyst hub interface.

This photo shows CICADA researchers Lucía and Micaela using the infrastructure to analyse worldwide genomic data.

Photo courtesy of María Inés Fariello Rico.