1
00:00:00,000 --> 00:00:16,000
Music

2
00:00:16,000 --> 00:00:20,000
Hi everyone and welcome to another episode of Data and AI with Mukundan.

3
00:00:20,000 --> 00:00:25,000
The show where we dive deep into the world of data science, artificial intelligence and everything in between.

4
00:00:25,000 --> 00:00:31,000
I am your host Mukundan Sankar and today we are unlocking a treasure chest of data visualization secrets.

5
00:00:31,000 --> 00:00:36,000
Let me ask you this: have you ever created a chart that looks OK but doesn't quite pop?

6
00:00:36,000 --> 00:00:47,000
Or maybe you have shown a graph to your team and watched their eyes glaze over, and you know it's because the chart doesn't tell the story clearly.

7
00:00:47,000 --> 00:00:52,000
Well if you are nodding along, today's episode is just for you.

8
00:00:52,000 --> 00:00:58,000
I am about to reveal nine hidden, or I guess less talked-about, Plotly tricks.

9
00:00:58,000 --> 00:01:06,000
Plotly is a Python data visualization library which can take your visuals from ordinary to extraordinary.

10
00:01:06,000 --> 00:01:14,000
Whether you are a seasoned data scientist or just starting out, these techniques will help you make a long-lasting impression.

11
00:01:14,000 --> 00:01:16,000
So let's get started.

12
00:01:16,000 --> 00:01:22,000
So before I dive into the techniques, let's talk about why Plotly is a game changer.

13
00:01:22,000 --> 00:01:32,000
So unlike traditional libraries like Matplotlib or Seaborn, which I am sure you are familiar with if you use Python for data visualization,

14
00:01:32,000 --> 00:01:40,000
Plotly specializes in creating interactive and dynamic visualizations.

15
00:01:40,000 --> 00:01:46,000
It's perfect for storytelling because it allows you to add layers of interaction and depth.

16
00:01:46,000 --> 00:01:54,000
But here's the thing: most people only scratch the surface of what it can do.

17
00:01:54,000 --> 00:02:01,000
They stick to basic bar charts or scatter plots and miss out on the powerful features hidden below the surface.

18
00:02:01,000 --> 00:02:03,000
And that's what we are here to explore today.

19
00:02:03,000 --> 00:02:17,000
So let me talk about nine hidden Plotly tricks that you will want to use in your data visualization work when you present to your business stakeholders.

20
00:02:17,000 --> 00:02:20,000
Alright, so let's dive into these tricks.

21
00:02:20,000 --> 00:02:25,000
I'll break down each one of them for you, explain why it's considered hidden

22
00:02:25,000 --> 00:02:36,000
and how you can use it to unlock deeper insights from your data.

23
00:02:36,000 --> 00:02:40,000
First up, we have the custom pairwise correlation matrix.

24
00:02:40,000 --> 00:02:43,000
This is a staple for understanding feature relationships.

25
00:02:43,000 --> 00:02:54,000
Now, most people create basic heatmaps, but you can make them far more informative by adding annotations and custom color scales.

26
00:02:54,000 --> 00:03:02,000
So I've used real-world data, and I have a notebook where I've implemented these hidden tips so that you can use them as well.

27
00:03:02,000 --> 00:03:05,000
So you can have a look at this notebook; it has all nine hidden tips.

28
00:03:05,000 --> 00:03:17,000
But I'm just going to talk through the notebook, and you can use the link in the show notes to see what data I've used and how the visuals actually come through.

29
00:03:17,000 --> 00:03:25,000
Right, so for my use case, I've used data sets from the UCI Machine Learning Repository.

30
00:03:25,000 --> 00:03:36,000
If you're not familiar with the UCI Machine Learning Repository, it's a repository of machine learning data sets from the University of California, Irvine.

31
00:03:36,000 --> 00:03:42,000
They've been providing data sets for anybody who's starting out in the field of data science.

32
00:03:42,000 --> 00:03:50,000
So if you are just starting out, you may not be as familiar with it; if you're seasoned, I'm sure you are.

33
00:03:50,000 --> 00:04:02,000
But again, the links to the data sets I've used are in the code in the notebook, which I've linked in the show notes.

34
00:04:02,000 --> 00:04:06,000
So the first data set I used here was the wine quality data set.

35
00:04:06,000 --> 00:04:13,000
With it, I used this technique to highlight how alcohol levels correlate with wine quality.

36
00:04:13,000 --> 00:04:22,000
Why this matters: stakeholders can see instantly which features drive the outcomes that they care about.

37
00:04:22,000 --> 00:04:32,000
Similarly, for whatever task you're working on, you can use annotations to highlight what you want your stakeholders to look at.

38
00:04:32,000 --> 00:04:36,000
The second thing I wanted to look at is dynamic data highlighting.

39
00:04:36,000 --> 00:04:44,000
So this is like conditional formatting in Excel, except on steroids.

40
00:04:44,000 --> 00:04:53,000
For instance, you can highlight wines with high alcohol and low pH, which are the key markers of premium quality.

41
00:04:53,000 --> 00:04:57,000
This technique simplifies analysis for decision makers.

42
00:04:57,000 --> 00:05:02,000
The third thing I want to talk about is density contours for class distribution.

43
00:05:02,000 --> 00:05:07,000
Now, have you ever felt like scatter plots don't quite tell the whole story?

44
00:05:07,000 --> 00:05:12,000
Adding density contours can show you where the data clusters and overlaps.

45
00:05:12,000 --> 00:05:18,000
I used this to visualize the separability of species in the Iris data set.

46
00:05:18,000 --> 00:05:23,000
It's a fantastic way to evaluate classification models visually.

47
00:05:23,000 --> 00:05:27,000
The fourth thing I want to talk about is faceted histograms.

48
00:05:27,000 --> 00:05:32,000
Now, faceted histograms let you break down data into subgroups.

49
00:05:32,000 --> 00:05:41,000
For example, I've used the car evaluation data set, where I created facets to compare buying prices by class.

50
00:05:41,000 --> 00:05:48,000
Now it's like getting multiple charts in one view, which is incredibly useful for comparing categories.

51
00:05:48,000 --> 00:05:53,000
The fifth thing I want to talk about is adding threshold lines.

52
00:05:53,000 --> 00:05:57,000
Now threshold lines are perfect for emphasizing decision boundaries.

53
00:05:57,000 --> 00:06:07,000
I've used a blood donation data set here, where I've added a line to show the critical recency threshold for donors.

54
00:06:07,000 --> 00:06:13,000
It's an easy way to draw attention to actionable insights,

55
00:06:13,000 --> 00:06:18,000
and that's why adding threshold lines, though a hidden trick,

56
00:06:18,000 --> 00:06:20,000
is very important.

57
00:06:20,000 --> 00:06:21,000
Custom annotations.

58
00:06:21,000 --> 00:06:25,000
So custom annotations transform your visuals into storytelling tools.

59
00:06:25,000 --> 00:06:31,000
Now, in the car evaluation data set, I used annotations to label the cheapest car options.

60
00:06:31,000 --> 00:06:37,000
This simple addition makes the chart immediately actionable for consumers.

61
00:06:37,000 --> 00:06:43,000
3D scatter plots.

62
00:06:43,000 --> 00:06:47,000
Now, sometimes you're looking at data and 2D just doesn't cut it, right?

63
00:06:47,000 --> 00:06:53,000
With 3D scatter plots, you can uncover relationships that are invisible in two dimensions.

64
00:06:53,000 --> 00:07:03,000
For example, I used the Iris data set to visualize class separability in three dimensions, revealing patterns that I would have missed otherwise.

65
00:07:03,000 --> 00:07:07,000
Tip number eight.

66
00:07:07,000 --> 00:07:13,000
It's animated visualizations. Animations can show how data evolves over time.

67
00:07:13,000 --> 00:07:25,000
Using the energy efficiency data set, I created an animated plot to visualize how compactness

68
00:07:25,000 --> 00:07:30,000
and surface area influence heating and cooling over time.

69
00:07:30,000 --> 00:07:34,000
It's a powerful way to reveal dynamic patterns.

70
00:07:34,000 --> 00:07:41,000
And finally, the last tip is, ironically, custom tooltips.

71
00:07:41,000 --> 00:07:47,000
So instead of static charts, tooltips let users interact with your visuals.

72
00:07:47,000 --> 00:07:58,000
So for this, I used the adult income data set, where I added age, education, and income class to the tooltip, making the chart far more engaging and informative.

73
00:07:58,000 --> 00:08:02,000
So I chose multiple data sets because I wanted to give some variety to this.

74
00:08:02,000 --> 00:08:08,000
But obviously you can just use one and just play around with it.

75
00:08:08,000 --> 00:08:11,000
So those were the nine tips I had.

76
00:08:11,000 --> 00:08:17,000
But now you might be wondering: how can I use this in the real world?

77
00:08:17,000 --> 00:08:20,000
Like how do I apply these tricks to my work?

78
00:08:20,000 --> 00:08:23,000
Let's look at some real-world scenarios.

79
00:08:23,000 --> 00:08:31,000
So in business intelligence, you can highlight key trends in sales or customer churn data using dynamic highlights.

80
00:08:31,000 --> 00:08:40,000
In scientific research, you can use density contours to understand clustering in biological data sets.

81
00:08:40,000 --> 00:08:51,000
And in education, threshold lines can help teachers identify at-risk students based on performance metrics.

82
00:08:51,000 --> 00:08:58,000
Now, these techniques aren't just about creating prettier visuals.

83
00:08:58,000 --> 00:09:06,000
They're about drawing actionable insights and telling compelling stories with your data.

84
00:09:06,000 --> 00:09:10,000
So if these techniques spark your curiosity, don't just stop here.

85
00:09:10,000 --> 00:09:19,000
Head over to my blog post and the notebook, which I've linked in the show notes, for the full code examples and step-by-step guides.

86
00:09:19,000 --> 00:09:25,000
I've linked all the data sets from the UCI Machine Learning Repository in the notebook.

87
00:09:25,000 --> 00:09:27,000
So you can try these tricks

88
00:09:27,000 --> 00:09:29,000
for yourself.

89
00:09:29,000 --> 00:09:33,000
And if you found today's episode valuable, please subscribe and leave a review.

90
00:09:33,000 --> 00:09:36,000
Your support helps us bring you more episodes like this one.

91
00:09:36,000 --> 00:09:41,000
Finally, let me know in the comments or on social media.

92
00:09:41,000 --> 00:09:44,000
Which of these tricks are you most excited to try?

93
00:09:44,000 --> 00:09:51,000
Or if you've already used them, I'd love to hear more about your experiences.

94
00:09:51,000 --> 00:09:54,000
Thanks for tuning in to Data and AI with Mukundan.

95
00:09:54,000 --> 00:09:57,000
Remember, the next time you create a chart, don't just settle for the ordinary.

96
00:09:57,000 --> 00:10:00,000
Use these tricks to make your visuals extraordinary.

97
00:10:00,000 --> 00:10:22,000
Until next time, keep exploring, keep experimenting and keep pushing the boundaries of what's possible with data.

