1
00:00:00,000 --> 00:00:05,760
Hello there, I'm George Hall and welcome back to another one of our Analytics Anonymous sessions,

2
00:00:05,760 --> 00:00:10,800
where we take a small, bite-sized look at data, insights, analytics and more.

3
00:00:10,800 --> 00:00:15,840
I'm joined today by Zoe Hitchens, an Analytical Manager here at Good Growth. Zoe, how are you?

4
00:00:15,840 --> 00:00:17,520
I'm all good, George. Thank you for having me.

5
00:00:17,520 --> 00:00:22,000
Good, good. Now, Zoe, I know you've been incredibly busy lately with all sorts of projects going on,

6
00:00:22,000 --> 00:00:24,320
so I'm excited to hear what you've got for us today.

7
00:00:24,320 --> 00:00:28,880
So we're going to be diving into sort of the key to good data.

8
00:00:28,880 --> 00:00:32,560
So having a little look at sort of the key principles that we sort of like to look at,

9
00:00:33,520 --> 00:00:36,560
and just understanding the key to good data.

10
00:00:37,280 --> 00:00:41,520
Perfect. And we say good data there. What does good data mean in the context of

11
00:00:41,520 --> 00:00:44,320
digital growth and then in the wider e-commerce industry?

12
00:00:45,040 --> 00:00:49,840
So it has probably a few different layers to it, but ultimately good data needs to be sort of

13
00:00:49,840 --> 00:00:54,000
fit for purpose. So whether that be for the individual, whether it be for the business,

14
00:00:54,000 --> 00:00:59,280
it should probably be quite clean, structured and easy to use. But ultimately,

15
00:00:59,280 --> 00:01:03,760
businesses need to be able to sort of analyse that data so they can understand their performance

16
00:01:03,760 --> 00:01:08,720
against their sort of set business objectives and requirements. They need to have that confidence

17
00:01:08,720 --> 00:01:13,680
around their data, whether that be the architecture of their data or the strategy.

18
00:01:13,680 --> 00:01:17,760
Otherwise, how are they sort of expected to make those informed business decisions?

19
00:01:18,400 --> 00:01:22,480
And then that data point bleeds into insight, which I know is one of the main things that you

20
00:01:22,480 --> 00:01:27,200
work on. And without the insight, you can't really optimise your testing and your delivery.

21
00:01:27,920 --> 00:01:35,440
Yeah, absolutely. So sort of that quality of that data means that businesses might make sort of

22
00:01:35,440 --> 00:01:40,960
ill-informed decisions if they don't have that trust behind it. So using poor quality data or

23
00:01:40,960 --> 00:01:46,960
incorrect data even, yeah, might mean that sort of businesses or teams within a business make some

24
00:01:46,960 --> 00:01:51,600
sort of questionable assumptions about their performance or incorrect assumptions about

25
00:01:51,600 --> 00:01:56,960
their reporting. And that could form the basis of some quite big business decisions. It might be

26
00:01:56,960 --> 00:02:01,680
that they're seeing a channel converting really well, so they're putting a lot of money behind it

27
00:02:01,680 --> 00:02:05,520
and actually, in essence, it is not performing any better than any of the other channels. So

28
00:02:06,160 --> 00:02:11,040
it can really affect the bottom line, sort of rolled up. So yeah, it's really important that

29
00:02:11,040 --> 00:02:18,320
they have a good base for their reporting. Also in terms of sort of organisational efficiency as

30
00:02:18,320 --> 00:02:24,800
well. So if you're sort of the raw format in which you're getting this data is quite messy,

31
00:02:24,800 --> 00:02:30,560
it needs a lot of processing and cleaning, then this is not wasted effort, but effort that could

32
00:02:30,560 --> 00:02:35,680
be better spent elsewhere if there was a sort of more formalised process already in place,

33
00:02:35,680 --> 00:02:39,520
more automated process for collecting that data. So it's in that ready-to-use format.

34
00:02:40,320 --> 00:02:44,720
Because if the data is hard to analyse, you should probably look to make some changes there.

35
00:02:44,720 --> 00:02:50,240
And speaking about good data, what are the key characteristics of what good data actually is?

36
00:02:50,240 --> 00:02:54,800
I mean, are there principles that you use or the wider Good Growth team use to differentiate the

37
00:02:54,800 --> 00:03:00,720
data between good and bad? So within Good Growth, we have sort of five principles, we like to call

38
00:03:00,720 --> 00:03:08,720
them, for what we deem as sort of good data. These are accurate, trended, explainable,

39
00:03:08,720 --> 00:03:14,000
reproducible and unbiased. If we were to try and make a fun acronym out of that,

40
00:03:14,000 --> 00:03:18,480
we struggled. ATERU, sounding like a sneeze, was the best we could come up with for that one,

41
00:03:18,480 --> 00:03:21,840
but I'm not going to coin that one. I don't think we use that one internally.

42
00:03:22,720 --> 00:03:28,560
But yeah, if we just quickly go through that list. So accuracy of data, we briefly touched on the

43
00:03:29,200 --> 00:03:35,920
collection of the data there, really ensuring that when you're going through and setting up

44
00:03:35,920 --> 00:03:42,720
your sort of traffic analysis, your event structure, even things like campaign strings and stuff like

45
00:03:42,720 --> 00:03:46,320
that, you want to be quite rigorous and meticulous and just make sure these things are correct.

46
00:03:47,600 --> 00:03:50,480
You want an easy life, you want things to be pulling through correctly, you don't want to

47
00:03:50,480 --> 00:03:55,280
have to worry about negating the double counting of events in certain areas of the page or

48
00:03:55,280 --> 00:03:59,280
making sure that actually some traffic that arrived from here actually is coming from here.

49
00:03:59,280 --> 00:04:04,720
And all the sort of caveats that come in between, you ultimately want an easy life and to be able

50
00:04:04,720 --> 00:04:07,920
to share that data with other people in the business as well without all of those caveats.

51
00:04:07,920 --> 00:04:13,040
And then when stitching together data sets, I think that's quite a common pitfall where people

52
00:04:13,040 --> 00:04:18,560
start to come a little bit unstuck in their data quality. So you want to sort of use as many sort

53
00:04:18,560 --> 00:04:24,400
of automated processes and scheduling as possible and essentially just reduce that human input.

54
00:04:25,440 --> 00:04:30,720
It avoids the risk of sort of human error. Everyone's human. We might have a typo every

55
00:04:30,720 --> 00:04:37,760
now and then. So yeah, avoid those sort of manual imports. If we go into sort of trending data,

56
00:04:37,760 --> 00:04:43,120
which I think was the next one down the list, this can be looking at trends within a

57
00:04:43,120 --> 00:04:48,960
short period of time, looking at comparisons sort of year on year. You ultimately want to understand

58
00:04:48,960 --> 00:04:53,280
if there's any sort of seasonality affecting your data, any sort of macroeconomic factors,

59
00:04:53,840 --> 00:04:58,240
but also just within, say, an A/B testing window: if you're seeing a significant

60
00:04:58,240 --> 00:05:03,120
impact for one of your, say, control or challengers, you want to know that that's been a consistent

61
00:05:03,120 --> 00:05:07,520
impact throughout the duration of your test. If it's just one day that's sort of skewing the data,

62
00:05:07,520 --> 00:05:10,800
you know that there's probably not enough rigour behind it to say, okay, we've got the confidence

63
00:05:10,800 --> 00:05:15,840
to say that that actually has a true impact. And then that feeds into the sort of explainable

64
00:05:15,840 --> 00:05:21,440
data point there. So if you're seeing that data isn't particularly trended, you're seeing

65
00:05:21,440 --> 00:05:26,800
some quite odd behaviour, whether that be in traffic or conversion rate, you ultimately want

66
00:05:26,800 --> 00:05:31,520
to know what's driving this. And this is where sort of data and insight come together. You see,

67
00:05:31,520 --> 00:05:36,320
say, a spike in sales on day X and, okay, that was because our marketing department sent

68
00:05:36,320 --> 00:05:40,880
an email out that day with a 50% discount code. You want to be able to marry up all the

69
00:05:40,880 --> 00:05:46,400
other activity happening within the business, so you know what has driven those impacts.

70
00:05:46,400 --> 00:05:51,680
Reproducibility is just again, validating the results and findings that you've seen,

71
00:05:51,680 --> 00:05:56,240
whether that be you being able to pull that same data again, or other people in the business being

72
00:05:56,240 --> 00:06:01,520
able to get to the same answer. It just makes it a lot cleaner and scalable as well within the business.

73
00:06:01,520 --> 00:06:06,800
And then the final one is around data being unbiased. That's kind of an interesting one.

74
00:06:06,800 --> 00:06:12,160
So you're asking: is data not just coming from your internally

75
00:06:12,160 --> 00:06:17,840
collected data? Is it coming from an external source? Is your data representative of your entire

76
00:06:17,840 --> 00:06:23,120
user base? Are you sure that you're getting the right demographic split, for example, which is

77
00:06:23,120 --> 00:06:32,000
influencing a lot of UX and other product decisions? So having biased data

78
00:06:32,000 --> 00:06:36,720
can essentially generate that sort of false insight, which again, in turn leads to those

79
00:06:36,720 --> 00:06:41,760
bad decisions that can affect that bottom line. And then that unbiased aspect applies not just from a

80
00:06:41,760 --> 00:06:46,000
data perspective, but also to the approach that you take to data, which I think is quite

81
00:06:46,000 --> 00:06:52,720
important. We've seen it time and time again,

82
00:06:52,720 --> 00:06:57,120
whether it be our testing clients or clients that we've done some insight work with.

83
00:06:57,120 --> 00:07:01,520
They have these big ideas for change. They want to sort of build it up, wrap it up into an experience

84
00:07:01,520 --> 00:07:07,280
or an A/B test. It's had a lot of eyes on it, whether it be from UX, product, experimentation,

85
00:07:07,280 --> 00:07:13,280
everyone getting involved. So if everyone's had input into it, it must do well, right?

86
00:07:13,280 --> 00:07:18,240
But actually, by the time it goes live, it's had a negative performance. But if it's sort of that

87
00:07:18,240 --> 00:07:23,440
borderline, no real impact shown, it's not doing any better, it's not necessarily doing anything

88
00:07:23,440 --> 00:07:28,000
worse, there might be that tendency to kind of lean towards, well, if it's not having a negative

89
00:07:28,000 --> 00:07:32,400
impact, it's having a positive impact. So just making sure that you don't have that biased

90
00:07:32,400 --> 00:07:38,320
outlook in terms of viewing the data sets and actually validating any impacts.

91
00:07:38,320 --> 00:07:42,640
And then out of those five, I mean, they're all obviously extremely important,

92
00:07:42,640 --> 00:07:48,000
but is there one that sticks out to you in terms of, I guess, the golden principle? Or is it

93
00:07:48,000 --> 00:07:51,360
a case of they're all so important because they can't exist without each other?

94
00:07:51,360 --> 00:07:57,600
I mean, yeah, ultimately, they're all very important. They sort of feed into each other.

95
00:07:57,600 --> 00:08:03,280
So yeah, you wouldn't think of having sort of one without the others. But I'd say the accuracy

96
00:08:03,280 --> 00:08:07,840
aspect is probably quite fundamental. And that's the reason I came to that one first:

97
00:08:08,480 --> 00:08:13,600
if you know that there are errors in your data, how can you be sure that you're able to

98
00:08:13,600 --> 00:08:18,640
use that to make any sort of decisions going forwards? I think I've read something online

99
00:08:18,640 --> 00:08:22,960
that's like "good data beats opinion". You can sort of imagine lots of people in a boardroom voicing

100
00:08:22,960 --> 00:08:26,720
their opinions and stuff like that. All it takes is for someone to say, this is what the numbers say.

101
00:08:26,720 --> 00:08:33,200
If you have that confidence and the sort of backing behind what you're seeing, it sort of negates any

102
00:08:33,200 --> 00:08:37,440
need for sort of personal preference or business decision. And so one of the things,

103
00:08:37,440 --> 00:08:43,440
obviously, we focus on massively at Good Growth is customer failure. How does

104
00:08:43,440 --> 00:08:48,480
a business's understanding of their customer failure then lead to, I guess, a better

105
00:08:48,480 --> 00:08:52,400
usage of data and a better application of it when you're looking to drive growth?

106
00:08:52,400 --> 00:08:55,440
Yeah, so I'd say understanding customer

107
00:08:55,440 --> 00:09:00,400
failure, it's very much the heart of what we do at Good Growth. And it's pretty crucial to sort of

108
00:09:00,400 --> 00:09:06,000
understand customer behaviour, so businesses can respond to that, take action, and ultimately

109
00:09:06,000 --> 00:09:13,280
respond to the customer's needs. So again, this notion of failure can be sort of understood using

110
00:09:13,280 --> 00:09:17,440
qualitative and quantitative methods of data capture. So you might be looking at

111
00:09:18,720 --> 00:09:24,080
some journey mapping, you can see what's causing a dropout at X point in the customer journey,

112
00:09:24,080 --> 00:09:29,040
you can see that from traffic analysis, or you might actually have some more explicit data

113
00:09:29,040 --> 00:09:33,520
on the customer voice through some customer testing or surveying. Once you have this sort

114
00:09:33,520 --> 00:09:38,640
of understanding and acceptance that there are points of failure, no one's perfect, you can

115
00:09:38,640 --> 00:09:45,040
really shape not just the actions that you take to mitigate these failure points, whether that be

116
00:09:45,040 --> 00:09:50,800
revising some of your user journeys, making some UX changes, or really targeting some A/B testing

117
00:09:50,800 --> 00:09:56,320
activity around that one area, but also those KPIs and success metrics that allow

118
00:09:56,320 --> 00:10:00,720
you to measure that impact. So if you know that you've got an area of the site that's a bit of a

119
00:10:00,720 --> 00:10:06,560
bottleneck in terms of conversion, you've got those success metrics that allow you to

120
00:10:06,560 --> 00:10:10,880
understand if you're having any impact and whether you're able to shift the needle at all.

121
00:10:10,880 --> 00:10:15,840
Perfect. And I guess closing things off, and maybe putting you on the spot a little bit here,

122
00:10:15,840 --> 00:10:20,400
but is there a key message that you'd give to a data team and insight team that were looking to

123
00:10:20,400 --> 00:10:25,840
get their data back on track or even in a much better place? What would a key takeaway be for

124
00:10:25,840 --> 00:10:32,320
them from this session? I'd say probably start simple. I think businesses get quite caught up

125
00:10:32,320 --> 00:10:40,640
with how granular they can take the data, which in its own right is very important. But equally,

126
00:10:40,640 --> 00:10:46,960
being able to report at a granular level is only sort of well and good if you're able to action at

127
00:10:46,960 --> 00:10:52,400
that level. So you could be looking at sort of your data, whether it be sort of your traffic

128
00:10:52,400 --> 00:10:59,920
split by channel or by demographic. Okay, we know that 70% of people come to our website

129
00:10:59,920 --> 00:11:04,720
through mobile first, for example, but you're still designing your website desktop first. So

130
00:11:04,720 --> 00:11:08,800
it's little things like that: if you have the data available, make sure that you're making some

131
00:11:08,800 --> 00:11:16,160
meaningful changes based on that. Again, sort of channel specific targeting and demographic

132
00:11:16,160 --> 00:11:21,840
specific, maybe like email campaigns and stuff like that. Use the data wisely, but don't get

133
00:11:21,840 --> 00:11:26,800
too caught up in making sure that you're reporting at that granular level, unless you

134
00:11:26,800 --> 00:11:31,840
have the capacity to act on it. Well, Zoe, an absolute pleasure as

135
00:11:31,840 --> 00:11:36,400
always to have you on the podcast. Always great to get your insights from the front line. And

136
00:11:36,400 --> 00:11:40,160
thank you very much to all of those of you listening. I hope to see you all again very soon.

137
00:11:40,160 --> 00:11:53,040
Thanks Zoe. Thank you very much. Take care.

