1
00:00:00,000 --> 00:00:05,760
Hey everyone, my name is Mukundan Sankar. I'm the host of the Data, AI, Productivity and Business

2
00:00:05,760 --> 00:00:10,960
with a little personality podcast. We're going to dive into something really exciting. It's another

3
00:00:10,960 --> 00:00:16,560
AI-powered solution and this time I'm applying it to PDFs. Okay, so let's dive into something that

4
00:00:16,560 --> 00:00:22,160
I think is going to resonate with anyone who works with data and AI: PDFs. They're the lifeblood of

5
00:00:22,160 --> 00:00:27,520
our field, but who actually has the time to read all of them? That's where this podcast gets really

6
00:00:27,520 --> 00:00:34,400
interesting. It's an AI-powered tool that I built here that basically turbocharges your PDF game.

7
00:00:34,400 --> 00:00:41,040
It can extract key information, turn it into audio and even generate Q&A. That's not all. It uses some

8
00:00:41,040 --> 00:00:46,880
familiar tools in the data science world like PyPDF2 and Hugging Face. So I'm curious about

9
00:00:46,880 --> 00:00:53,760
how much of a pain point PDFs are in your work. For me, it's a major one. In data science and AI,

10
00:00:53,760 --> 00:00:59,360
staying current means sifting through a mountain of PDFs. And it's not just about keeping up.

11
00:01:00,080 --> 00:01:07,200
Sometimes I really need to dig into a specific paper for a project. Even skimming for key

12
00:01:07,200 --> 00:01:13,120
takeaways can take forever, especially with all the technical jargon. That's where I believe this

13
00:01:13,120 --> 00:01:21,120
tool can be a game changer. Now let me break it down for you how this works. If I start with the

14
00:01:21,120 --> 00:01:29,920
first step, it uses PyPDF2, which I spoke about. PyPDF2, let me talk about that briefly. So PyPDF2

15
00:01:29,920 --> 00:01:36,320
helps you to extract all the key text from the PDF that you uploaded. A lot of people might be

16
00:01:36,320 --> 00:01:42,480
familiar with PyPDF2, especially if you're in the Python world and the data science world. So

17
00:01:42,480 --> 00:01:49,680
for those of you who aren't, it's a Python-based library specifically designed for

18
00:01:49,680 --> 00:01:59,200
working with PDFs. But the real magic happens in the next step. Summarization. The tool that I built,

19
00:01:59,200 --> 00:02:04,640
it uses a summarizer from Hugging Face. And for those who aren't familiar with Hugging Face,

20
00:02:04,640 --> 00:02:12,080
it is a platform with a ton of pre-trained AI models. So you can just use any AI model, like

21
00:02:12,080 --> 00:02:19,440
text-to-video or text-to-audio, and all of that is available there. And you can
 
22
00:02:19,440 --> 00:02:26,720
obviously generate summaries too. The summarizer, for this particular use case,

23
00:02:26,720 --> 00:02:34,800
it condenses those pages of information into concise digestible chunks. So imagine being able

24
00:02:34,800 --> 00:02:40,400
to grasp the core concepts of a complex research paper without getting lost in the details.
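
A minimal sketch of the extract-then-summarize steps described above. The episode names PyPDF2 and a Hugging Face summarizer but not the exact code, so the function names, the default summarization model, the 400-word chunk size, and the length settings are all assumptions made for illustration. Chunking is included because summarization models only accept a limited amount of input at once.

```python
from typing import List


def extract_text(pdf_path: str) -> str:
    """Read every page of a PDF and join the text (requires PyPDF2)."""
    from PyPDF2 import PdfReader  # imported lazily so chunk_text works without it

    reader = PdfReader(pdf_path)
    # extract_text() can come back empty for scanned or image-only pages
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def chunk_text(text: str, max_words: int = 400) -> List[str]:
    """Split text into pieces of at most max_words words each (assumed size)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize(text: str, max_words: int = 400) -> str:
    """Summarize each chunk with a Hugging Face pipeline and join the results."""
    from transformers import pipeline  # downloads a default model on first use

    summarizer = pipeline("summarization")
    parts = []
    for chunk in chunk_text(text, max_words):
        result = summarizer(chunk, max_length=130, min_length=30, do_sample=False)
        parts.append(result[0]["summary_text"])
    return " ".join(parts)
```

Calling `summarize(extract_text("paper.pdf"))` would return one combined summary string; `paper.pdf` is a placeholder file name.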

25
00:02:40,400 --> 00:02:46,640
And for those of you who are always on the go, get this. The tool can convert those summaries

26
00:02:46,640 --> 00:02:52,800
into audio files using Google Text to Speech. That's amazing, right? Have you ever used Google

27
00:02:52,800 --> 00:02:58,320
Text to Speech? It's another Python-based library and, as the name suggests, it

28
00:02:58,320 --> 00:03:06,400
converts text to audio. And think about how you can use text to audio for your use case, for

29
00:03:06,400 --> 00:03:13,600
anything, right? Like you're on the go, maybe you're driving somewhere and you don't have access to

30
00:03:13,600 --> 00:03:18,240
your phone, or you just have access to Bluetooth or something, right?

31
00:03:19,280 --> 00:03:24,640
You're listening to a podcast, and this podcast is something that you created.

32
00:03:25,760 --> 00:03:30,240
That's where Google Text to Speech comes in. So basically anything that you've written,

33
00:03:30,240 --> 00:03:35,920
you're converting that into audio. So you are essentially listening to a podcast made by you

34
00:03:36,480 --> 00:03:44,160
while working out. Another company that does this these days is Google again, with their NotebookLM

35
00:03:44,160 --> 00:03:51,680
product. So they are converting your regular text to a podcast. But yeah, this is a bit different,

36
00:03:51,680 --> 00:03:55,760
because this is something that you built yourself. So there's a different pleasure in that.
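
A small sketch of the text-to-audio step, assuming the gTTS package is what is meant by the Google Text to Speech Python library. The helper names and the `summary.mp3` output path are made up for illustration, and gTTS needs an internet connection when it runs.

```python
import re


def clean_for_speech(text: str) -> str:
    """Collapse line breaks and repeated spaces into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def save_audio(text: str, out_path: str = "summary.mp3", lang: str = "en") -> str:
    """Convert text to an MP3 file with gTTS and return the file path."""
    from gtts import gTTS  # imported lazily; install with `pip install gTTS`

    gTTS(text=clean_for_speech(text), lang=lang).save(out_path)
    return out_path
```

Passing the summary text to `save_audio` would write an MP3 you can play on the go.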

37
00:03:55,760 --> 00:04:00,880
So just coming back to this, I think the possibilities are pretty exciting with this tool.

38
00:04:01,760 --> 00:04:08,800
It's like having your PDF on the go, but you are listening to it. Like I said,

39
00:04:10,000 --> 00:04:15,440
it gets even more interesting. Another function which I really want to talk about

40
00:04:15,440 --> 00:04:24,000
is Hugging Face's question-and-answer features. So basically it is used to generate questions

41
00:04:24,000 --> 00:04:32,080
and answers from the PDF content. Now imagine preparing for an important meeting or presentation

42
00:04:32,080 --> 00:04:38,000
and you could have AI highlight the key points and even anticipate the questions

43
00:04:38,000 --> 00:04:44,560
your audience might ask. Now what's fascinating here is that it goes beyond just summarizing

44
00:04:44,560 --> 00:04:49,440
the information. It's creating a more interactive way to engage with the content.
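
A rough sketch of the answering half of the Q&A step. The episode does not say which Hugging Face model is used or how the questions themselves are generated, so this assumes the questions are supplied by the caller and uses the default extractive question-answering pipeline. The function names and the `min_score` cutoff are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def load_qa_model() -> Callable:
    """Load the default extractive question-answering pipeline (requires transformers)."""
    from transformers import pipeline  # imported lazily; downloads a model on first use

    return pipeline("question-answering")


def answer_questions(context: str, questions: List[str],
                     qa: Optional[Callable] = None,
                     min_score: float = 0.1) -> List[Dict]:
    """Ask each question against the PDF text and keep reasonably confident answers."""
    qa = qa or load_qa_model()
    results = []
    for question in questions:
        out = qa(question=question, context=context)  # dict with "answer" and "score"
        if out["score"] >= min_score:
            results.append({"question": question,
                            "answer": out["answer"],
                            "score": out["score"]})
    return results
```

The `qa` argument lets you swap in a different model, or a stub when trying the function out without downloading anything.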

45
00:04:51,680 --> 00:04:58,000
Now I built out a front-end application using Streamlit, which is a Python-based library,
 
46
00:04:58,800 --> 00:05:06,000
and it helps you see the content in a more interactive way. So I'm going to go ahead and
 
47
00:05:06,000 --> 00:05:12,240
show you how it works. It helps you to see data on a website. So it's basically like

48
00:05:12,240 --> 00:05:18,320
you're creating a website and Python by itself wouldn't have been able to do it, I think.

49
00:05:18,320 --> 00:05:23,520
And Streamlit really makes that process easy. So anybody who doesn't know how to do front-end

50
00:05:23,520 --> 00:05:32,800
programming can use the Streamlit library from Python to make an application. So that's what I did:
 
51
00:05:32,800 --> 00:05:40,960
I created an application which asks the user to upload a PDF and generate a summary,
 
52
00:05:42,080 --> 00:05:48,480
generate questions and answers, and even generate audio. What is audio, you might ask?

53
00:05:49,680 --> 00:05:53,360
So what I spoke about before was the text to speech feature, right?

54
00:05:53,360 --> 00:06:03,280
With the text-to-speech feature, it converts the highlights of this text into audio.
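
A sketch of how the Streamlit page could tie the upload, summary, and audio steps together. The layout, labels, and the `render` function are assumptions, not the actual app from the blog post. The Streamlit module and the three step functions are passed in as arguments so the page logic stays separate from the model code.

```python
from typing import Callable, Optional


def render(st, extract: Callable[[object], str],
           summarize: Callable[[str], str],
           synthesize: Callable[[str], bytes]) -> Optional[str]:
    """Draw the page with the given Streamlit module; return the summary if one was made."""
    st.title("PDF Summarizer")
    uploaded = st.file_uploader("Upload a PDF", type="pdf")
    if uploaded is None:
        st.info("Upload a PDF to get started.")
        return None

    summary = summarize(extract(uploaded))  # uploaded is a file-like object
    st.subheader("Summary")
    st.write(summary)

    # Only build the audio when the user asks for it
    if st.button("Generate audio"):
        st.audio(synthesize(summary), format="audio/mp3")
    return summary
```

In an `app.py` started with `streamlit run app.py`, you would `import streamlit as st` and call `render(st, ...)` with your own extract, summarize, and text-to-speech functions.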

55
00:06:04,000 --> 00:06:07,920
And this can obviously extend to other things; basically, you can just convert the whole
 
56
00:06:07,920 --> 00:06:13,600
text into audio. I just chose not to do that, but I think if you wanted, that could
 
57
00:06:13,600 --> 00:06:20,800
be done as well. All you have to do is change a few bits of code. And the blog post,

58
00:06:20,800 --> 00:06:26,560
which actually has the step-by-step approach will be linked in the show notes. So do keep an eye on

59
00:06:26,560 --> 00:06:33,600
that. So yeah, just think about this: this PDF tool is not just about saving time. It can help you

60
00:06:33,600 --> 00:06:40,640
absorb complex information more effectively, right? Imagine being able to quickly grasp the

61
00:06:41,360 --> 00:06:47,600
key takeaways from a dozen research papers on a new technique. So you're learning something.

62
00:06:47,600 --> 00:06:54,720
The other day I was learning about LLM quantization or something. And see, I mean,

63
00:06:54,720 --> 00:06:58,800
I don't even know that topic yet. I don't know how to pronounce it, but I was trying to learn this,

64
00:07:00,800 --> 00:07:08,400
and get a summary of it using my tool. And it did provide a quick summary. So it was nice.
 
65
00:07:08,960 --> 00:07:13,040
Obviously, I think NotebookLM is better, but I think this is something that

66
00:07:13,040 --> 00:07:17,360
you can build yourself and that gives me the pleasure of using my tool.

67
00:07:18,560 --> 00:07:26,800
It can obviously use more improvements. And I would say just play around with the tool and
 
68
00:07:27,360 --> 00:07:31,520
think about what else you can do to make it better. And if you have any suggestions,

69
00:07:32,480 --> 00:07:38,800
feel free to send them my way. So it's important to definitely acknowledge that no technology is

70
00:07:38,800 --> 00:07:46,400
perfect. And we know that AI is just a tool, right? It has its limitations, but it's exciting to see

71
00:07:46,400 --> 00:07:53,040
how AI is evolving so rapidly. And it's helping tackle some of the biggest challenges we are

72
00:07:53,040 --> 00:08:03,040
facing, right? Information overload. So obviously, like I said, this is just another AI application

73
00:08:03,040 --> 00:08:12,080
and I'm hoping to keep generating more AI content, more how-tos, like how to do this using AI

74
00:08:12,080 --> 00:08:21,440
and how to do that. So that's something that I get excited to do. Now, you can also think about

75
00:08:21,440 --> 00:08:26,960
how you can do an AI-powered analysis on websites, right? I just spoke about PDFs,

76
00:08:26,960 --> 00:08:35,920
so we can extend it to websites, blog posts, and even books. It could completely revolutionize

77
00:08:35,920 --> 00:08:44,000
how we learn and stay informed. So how could this tool fit into the workflow of someone in our field?

78
00:08:44,800 --> 00:08:50,080
First and foremost, like I said, it's like a massive time saver. Now, instead of spending hours

79
00:08:50,080 --> 00:08:55,920
reading every paper from start to end, I could just use this tool to quickly extract

80
00:08:55,920 --> 00:09:01,280
the key findings, identify the common themes, and even generate potential questions to guide

81
00:09:01,280 --> 00:09:08,400
a deeper dive into specific aspects of the research. And that's a mouthful. It's just,

82
00:09:09,040 --> 00:09:13,440
I want to be able to save time, right? And it's not just about research. Like I said,
 
83
00:09:14,560 --> 00:09:21,200
you might be preparing for work presentations or a new visualization technique. And this tool could

84
00:09:21,200 --> 00:09:26,240
quickly summarize the key points from several articles and white papers and even generate

85
00:09:26,960 --> 00:09:32,720
potential questions that the audience might ask. Plus, it could be a lifesaver when I need to

86
00:09:32,720 --> 00:09:38,160
quickly get up to speed on a new technology, right? Like if you want to learn something quickly,

87
00:09:38,800 --> 00:09:47,280
a quick summary just makes your life much easier. So I always try to
 
88
00:09:47,280 --> 00:09:53,520
optimize whenever I can. So I'm always trying to use technology, especially AI these days,
 
89
00:09:54,320 --> 00:09:59,600
to make my life easier. And I hope you're also thinking along similar lines.

90
00:10:01,840 --> 00:10:07,920
So just imagine your PDF experience turbocharged: you're not just getting key information,
 
91
00:10:09,760 --> 00:10:14,000
you're also getting it in audio, and you're asking questions and getting
 
92
00:10:14,000 --> 00:10:22,640
answers from it, all by just using Python-based libraries and AI. So working through
 
93
00:10:22,640 --> 00:10:29,520
PDFs is a huge pain point, and tools like this would definitely make life easier.

94
00:10:30,640 --> 00:10:35,120
So staying current is definitely something we need to do in the data science and AI field.

95
00:10:35,120 --> 00:10:43,760
So instead of going through so many PDFs, just make your life easy: summarize them and
 
96
00:10:45,200 --> 00:10:49,600
build out an AI-powered tool that really does everything for you.

97
00:10:53,360 --> 00:10:58,960
So where could this go? I would say no tool is perfect. Like AI has its limitations.

98
00:10:58,960 --> 00:11:05,120
That said, it's exciting to see how AI evolves to tackle some of
 
99
00:11:05,120 --> 00:11:12,880
our biggest challenges. And finally, I'd just like to say that as AI continues to advance,
 
100
00:11:12,880 --> 00:11:19,840
we'll see even more sophisticated tools emerge that will transform how we interact

101
00:11:19,840 --> 00:11:27,520
with information. It's an exciting time to be in this field. So, a question for you:
 
102
00:11:27,520 --> 00:11:34,880
what PDF would you run through this tool? How would you use all the time that

103
00:11:34,880 --> 00:11:42,400
you saved? For those of us always under pressure to stay ahead, this tool could be a game changer.

104
00:11:44,000 --> 00:11:50,400
But as with any powerful technology, it's essential to use it properly and responsibly.

105
00:11:50,400 --> 00:11:58,640
And while it is really tempting to see AI as a silver bullet, I do approach it as a
 
106
00:11:58,640 --> 00:12:04,960
complement to my own expertise. Ultimately, the balance lies in leveraging AI's power
 
107
00:12:06,000 --> 00:12:11,680
while keeping your critical thinking skills sharp. And AI challenges us to become even

108
00:12:11,680 --> 00:12:18,560
better thinkers and learners, helping us engage with information actively, question it, and use

109
00:12:18,560 --> 00:12:24,080
it to strengthen our understanding. So I hope you got some value out of this podcast today and I

110
00:12:24,080 --> 00:12:29,280
hope to catch you in the next one. This is Mukundan Sankar. Thank you and I will see you in the next

111
00:12:29,280 --> 00:12:51,280
episode.

