
How creative storytelling helped me learn and communicate data science

Creative storytelling can help with the learning of technical concepts, create a sense of connection, entertain, influence, and inspire

American literary scholar Jonathan Gottschall once said, “We are, as a species, addicted to story. Even when the body goes to sleep, the mind stays up all night, telling itself stories.”

My former maths teacher once told a story about flipping a coin to explain repeated multiplication: if someone flips a coin n times, there are 2^n possible outcomes. By giving meaning and context to an abstract concept, the story turned repeated multiplication into a lasting memory. It’s been a while since I last revisited the laws of exponents, but I’ve always been keenly aware of how powerful stories are in helping me learn and remember new concepts.
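The coin-flip rule can be checked by brute force. The following is a minimal Python sketch, added purely as an illustration (the function name `coin_flip_outcomes` is made up for this example, and Python is used here even though the author works in R). It lists every sequence of heads and tails for n flips and confirms that the count equals 2 to the power n.

```python
from itertools import product


def coin_flip_outcomes(n):
    """Return every possible sequence of heads (H) and tails (T) for n flips."""
    return ["".join(seq) for seq in product("HT", repeat=n)]


for n in range(1, 5):
    outcomes = coin_flip_outcomes(n)
    # Each extra flip doubles the number of sequences, so the count is 2**n.
    assert len(outcomes) == 2 ** n
    print(n, "flips:", len(outcomes), "outcomes")
```

Each additional flip doubles the list, which is exactly the repeated multiplication the story was meant to convey.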

Trained as a wet lab biologist, I have always found writing code and wrangling big datasets unnatural. When I was a novice data wrangler, not a day went by without me going down the rabbit hole of Google or Stack Overflow in search of solutions to my coding problems. Even so, I found online data science tutorials and forums unwieldy, disconnected, overwhelming, and hard to digest.

As Squid Game trended on social media, with its games replicated thousands of times and meme after meme released, an idea struck me. While attempting to multitask, writing R code as I watched an episode of the show, I thought: let’s try creating something new today by combining two unrelated topics into something educational and, hopefully, entertaining.

Concepts in data science can be hard to grasp. Tutorials are laced with technical jargon, made worse by endless reams of text, so that even the most exciting topic quickly becomes boring. I wanted to create memorable and riveting video tutorials that embrace storytelling and attract even the unlikeliest of viewers, such as my mother, a non-scientist from the boomer generation, or high-school students, to watch the whole tutorial and learn something new by the end.

I’ve loved art and design since childhood. Being a scientist and academic wasn’t something I had considered. I’ve suppressed my interest in graphic design and art so that I’m not seen as an unfocused scientist. But I’ve never lost my creative streak. Cartoons and animation are a gateway to great storytelling. Saturday morning cartoons were the highlights of my week. Cartoons are evergreen: many adults still watch them for the psychological benefits. Animation gives creators complete control over the narrative and over the extent of emotion displayed by the animated characters, which makes it possible to create an immersive data science experience for the viewer. Strategic additions of drama, humor or suspense help create a connection with the audience, which is important since a 5-minute tutorial video leaves only a few minutes to build a relationship with viewers.

In the spirit of public engagement and community-based learning, I embarked on a journey to ‘storify’ data science through YouTube videos. We started with a video on unsupervised machine learning to figure out which Squid Game player would win the first game. This video was inspired by an actual research article published by my lab, where we used ensemble clustering to identify clinical and genomic features that could predict cancer patient outcomes.

Simplifying Science. Video permission by blog author

The tug-of-war episode of Squid Game lends itself to exploring the relationships between players using social network analysis and graph theory. In a network graph, nodes represent players, and two players are connected by a link (or edge) if they appeared in the same scene. We created an animated video on network analysis to visualize players’ relationships and look for clusters of players, which in turn might dictate their choice of teammates.
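The co-appearance idea described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code from the video: the scene lists and player labels are invented, the helper names `build_graph` and `connected_components` are hypothetical, and connected components stand in as the simplest possible notion of a "cluster" (the video may well use a different method).

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical scene lists, invented for illustration only.
scenes = [
    ["P001", "P456", "P218"],
    ["P456", "P067"],
    ["P101", "P212"],
]


def build_graph(scenes):
    """Link two players with an edge if they share at least one scene."""
    graph = defaultdict(set)
    for scene in scenes:
        for player in scene:
            graph[player]  # register the node even if the player appears alone
        for a, b in combinations(scene, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Group players into clusters that are reachable from one another."""
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


graph = build_graph(scenes)
for cluster in connected_components(graph):
    print(sorted(cluster))
```

With the made-up scenes above, the players fall into two groups: one of four players linked through P456, and one pair that never shares a scene with the rest.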

These videos have helped me grow in my data science learning journey. Each video has its own learning goal, explains the core concepts, and is built around problem-solving. Creating these videos forces me to approach a data science problem systematically, analyze the problem, and plan a solution using relevant statistical principles. For each video, I create a mock dataset based on the problem statement. This pushes me to ask the right questions to figure out which variables I need to include in the mock dataset. Planning tutorials also taps into my metacognition, making me think about my data science learning more intentionally. Evaluating my thought process may, in the long run, help me pick more effective data science solutions for my everyday research.

If I can convince a single person with zero experience in data science and research to open their laptop and learn along with the videos, using the sample datasets and scripts provided, I’d consider this endeavor worthwhile. Since releasing the videos, I’ve received messages from students and members of the public sharing their experience of learning data science for the first time by following the tutorials. To them, I say, “Focus on being functional and learning just enough to get by.” You can achieve a lot with “just enough.” These videos help viewers learn data science by doing specific tasks. Viewers get to practice the fundamentals of asking questions, dabbling in statistics, and visualizing results.

If you’re a visual learner like me, storifying data science helps with learning complex concepts, as people tend to remember more of what they see than what they read. My goal is to reach hard-to-reach audiences, such as high-school students and non-scientists, and empower them to get started with data science projects. I want to help weave data science into the fabric of society by bridging the silos between art and science through these animated tutorials. With my anthropomorphic friends in our Data Science tutorial series, I hope I have convinced you that explaining data science through stories and animations should have a place in mainstream data science learning.

Author: Alvina G. Lai is an associate professor, data scientist, illustrator, science communicator and content creator for the Zero to Data Science YouTube channel. She leads a Health Informatics group at University College London and has won multiple awards for her research, including a recent award from the Royal College of Physicians. Her work has also been featured in a BBC Panorama Documentary.

  1. Good for you I am proud of your efforts and disciplines performed not mention your attitude towards finding just one person to see it and then it spreads because one person can change the world. I really am impressed with your ability to see the use of the application and to create a new dataset to fit in an observation of value. Thank you for your time and hard work this is worth it to me to see someone of your caliber make time for others. Thank you for your service… MackD93
