How to design high impact product experiments?

Negar Mokhtarnia 🚀
Published in Product Coalition
6 min read · Sep 21, 2020



Part 2 of a series on experimentation

This article covers when to experiment, ten guidelines for designing high-impact experiments, and how to know whether you can trust the results.

Running experiments has become one of the most valuable tools for product managers to validate ideas, de-risk releases and measure the impact of their work. Good product teams understand that their customers’ needs and behaviours are ever-changing, and the only way to truly predict the impact of an update is to try it on a small set of real customers. In addition, the process of ideation and hypothesis-driven testing increases the team’s ability to empathize with users and come up with creative solutions to their pain points.

Often, other functions such as Marketing and Sales also use experimentation to identify target segments, optimize campaign performance and trial pricing strategies. Many companies, in fact, set up cross-functional experimentation teams and use a test-and-learn mentality as their business transformation strategy. Amazon, Netflix and Google have dedicated considerable resources to creating experimentation frameworks and cultures that inform their business decisions.

Since experimentation has become a discipline in its own right, there are established best practices for cross-functional teams to follow. However, many smart teams still struggle to run experiments that move the business’s key results and advance their understanding of their users. This article tackles the principles that successful teams use to maximize their outcomes.

Firstly, how do you know when you need to experiment?

  • High-risk releases- Experimentation allows you to expose only a small number of users to a high-risk release and get accurate insight into how your systems and users interact in real time. This saves you from brand backlash and costly rollbacks.
  • Uncertain outcome- Sometimes it is impossible to predict how metrics not directly impacted by your experiment will react, especially when the user ecosystem is complicated, such as a two-sided marketplace or a constraint-driven service.
  • UI and experience optimizations- It is common for UX decisions to become a function of the team’s preference over time. By testing variations of a design or user flow you can learn what your user base is most receptive to, regardless of your brand guidelines.
  • Align on priorities- Minimum Viable Experiments help you align your team and stakeholders on priorities by measuring value to the business without bias.

Ok now that you are sure this is the time for experimentation, how do you design the right experiments?

Here are 10 guidelines to design high impact experiments.

  1. Understand your customers, use cases & personas- Relevant experiment hypotheses require a deep qualitative understanding of customers, whether it comes from user testing, customer reviews or market research. By understanding the various use cases of your product and the customer personas, you can find the gaps in customer experience that can be tested and optimized.
  2. Ground hypotheses in relevant psychological principles- Concepts such as confirmation bias, cognitive load and social proof, among other biases and heuristics, form the basis of many customer experiences and can be leveraged to change customer behaviour with a well-designed experiment. By identifying customer motivation and which psychological principles are most at play, you can avoid creating experiments that try to influence users against human nature, and instead bolster existing user tendencies to achieve the desired effects.
  3. Research competitors and adjacent markets- There is wisdom in what other companies do, even though it might not work for your customer base at this moment. Try and draw parallels between patterns you see in your data and how other companies that serve similar customers are updating their product. At best, you will find successful ideas that you can develop further and at worst, you will learn how your customers are behaving differently.
  4. Use data to scope and prioritize ideas fairly- Using a consistent framework with data enables the team and the wider organization to align on the same objectives and priorities. Once everyone agrees on the unified methodology used for prioritizing ideas, less time and energy will be wasted on trade-off discussions between any two ideas. This increases experimentation velocity, which enhances the overall impact of your testing program.
  5. Prioritize based on company impact- Use the overall business strategy as the first input to your prioritization framework (such as RICE or PIE) and layer in probability of success and resource intensity on top of that. Prioritizing based on customer asks carries an inherent risk: customers’ needs change over time, and users may not know they want a new feature until they experience it.
  6. Diversify the experiments- To optimize resourcing, enhance learning and create a stable workload, alternate between experiments that are high and low risk, large and small in scope, and geared towards learning versus impact. This variety ensures that your team experiences a mix of learning opportunities and celebrated successes, which helps them stay engaged while continually delivering measurable business impact.
  7. Find the persona/use case that will react- Experimenting on all customers often leads to insignificant results: even if a small segment responds very strongly to the experiment, the change in the metric gets diluted by the larger overall base. Designing experiments for each of the use cases ensures that strong signals are not lost in the noise and that the team learns about the nuances of each persona.
  8. Double down on successes, review failures for learnings- To gain the most from experimentation, teams need to create a high velocity of testing. The best way to keep the team moving is to treat the experiment roadmap like a game of battleship. If experiments in one area are showing promising results, the team should focus and optimize further in that area until they reach diminishing returns. If one area seems hard to impact, however, it might be a good time to get creative and start looking from a different angle.
  9. Steadily increase the forcefulness- When introducing a new feature, it is a good idea to start with minimal additional friction to ensure that the value outweighs the loss in customer experience. If customers respond positively to the change and the feature’s value is validated, you can increase the forcefulness of the experience to require more customers to engage with the feature or make it a core step in their user journey.
  10. Ideas are important but execution is the key- A good idea with bad execution will always fail, and since there are many ways to test an idea, it is important to ensure that the execution has the right quality and context to make the hypothesis viable. The execution includes the design, placement, development and targeting of the experiment. A badly designed addition, or one that shows up at the wrong time in the customer journey, is bound to fail, but it is important not to dismiss a hypothesis because of bad execution.
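The prioritization step in guideline 5 can be made concrete with a small scoring sketch. RICE scores each idea as (Reach × Impact × Confidence) / Effort; the idea names and numbers below are purely illustrative, and in practice you would weight the scores by strategic fit on top of this.

```python
# Minimal RICE prioritization sketch. All ideas and figures are hypothetical.
# RICE score = (Reach * Impact * Confidence) / Effort
from dataclasses import dataclass


@dataclass
class ExperimentIdea:
    name: str
    reach: int         # users affected per quarter
    impact: float      # 0.25 = minimal ... 3 = massive
    confidence: float  # 0.0 - 1.0 estimated probability of success
    effort: float      # person-months

    @property
    def rice_score(self) -> float:
        return (self.reach * self.impact * self.confidence) / self.effort


ideas = [
    ExperimentIdea("Onboarding checklist", reach=8000, impact=1.0,
                   confidence=0.8, effort=2.0),
    ExperimentIdea("Pricing page copy test", reach=20000, impact=0.5,
                   confidence=0.5, effort=0.5),
]

# Highest score first: this ordering is what the team aligns on.
for idea in sorted(ideas, key=lambda i: i.rice_score, reverse=True):
    print(f"{idea.name}: {idea.rice_score:,.0f}")
```

Because the inputs and the formula are agreed upfront, trade-off debates shift from "which idea do I like" to "which estimate is wrong", which is a far more productive argument.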

Finally, how can you be sure that you can trust the results?

Here are a few instances where your results may not be valid:

  • Contaminated data- When multiple experiments run in parallel and test groups get mixed, it becomes hard to tell which experiment drove the change in the KPIs. This is where experimentation infrastructure becomes extremely important. It is best practice to tag customers per experiment variant, with rules prohibiting conflicting variants from being assigned to the same user.
  • Small sample size- A small data sample used to drive recommendations can easily be influenced by a few outlier events. The required sample size for an experiment can be calculated from the estimated magnitude of change and the variability of the data. Accepting less than 95% statistical significance may lead you to mistake a random coincidence for a repeatable phenomenon. Obtaining an appropriate sample size can be especially difficult for products with low traffic, in which case fewer tests may be run over longer periods, or qualitative studies may be the best way to gain insight.
  • Insufficient time- Running an experiment for a short amount of time (even with a sufficient sample size) may cause you to miss the natural usage patterns of your product and capture only peaks or troughs rather than a full usage cycle.
  • External events- Competitors’ sales, holidays and macroeconomic events (such as a pandemic) will affect the validity of the data you collect. Try to avoid running experiments when you can predict such events, or exclude the data gathered during these periods from your analysis.
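The contaminated-data point above is commonly handled with deterministic, layer-based assignment: experiments that would conflict share a "layer", and the layer's traffic buckets are split between them so no user can land in two conflicting variants. A minimal sketch, with hypothetical experiment names and bucket splits:

```python
# Sketch of deterministic, conflict-free variant assignment.
# Assumption: conflicting experiments are grouped into one mutually
# exclusive "layer" that splits a fixed set of traffic buckets.
import hashlib


def bucket(user_id: str, salt: str, n_buckets: int = 100) -> int:
    """Stable hash of user_id + layer salt into one of n_buckets."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % n_buckets


# Hypothetical "checkout" layer: each experiment owns a disjoint bucket
# range, and buckets 80-99 stay as an untouched holdout.
CHECKOUT_LAYER = {
    "exp_new_cta": range(0, 40),
    "exp_one_page_flow": range(40, 80),
}


def assign(user_id: str) -> str:
    """Return the single checkout experiment this user belongs to."""
    b = bucket(user_id, salt="checkout-layer")
    for experiment, buckets in CHECKOUT_LAYER.items():
        if b in buckets:
            return experiment
    return "holdout"
```

Because the hash is deterministic, a user is always re-assigned to the same variant on every visit, and because the bucket ranges are disjoint, two conflicting experiments can never both claim the same user.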
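The sample-size calculation mentioned above can be approximated with the standard two-proportion formula, which takes the baseline conversion rate and the minimum detectable effect (the magnitude of change you care about). A sketch with illustrative numbers, using only the Python standard library:

```python
# Approximate per-variant sample size for an A/B test on a conversion
# rate, via the standard two-proportion formula. Numbers are illustrative.
import math
from statistics import NormalDist


def sample_size(p_base: float, mde: float,
                alpha: float = 0.05, power: float = 0.8) -> int:
    """Users needed per variant to detect an absolute lift of `mde`."""
    p_test = p_base + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 95% two-sided -> ~1.96
    z_beta = NormalDist().inv_cdf(power)           # 80% power -> ~0.84
    variance = p_base * (1 - p_base) + p_test * (1 - p_test)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


# Detecting a 5% -> 6% lift requires far more users per variant
# than detecting a 5% -> 10% lift:
print(sample_size(0.05, 0.01))
print(sample_size(0.05, 0.05))
```

This is why low-traffic products struggle: halving the detectable effect roughly quadruples the required sample, so small lifts on small audiences quickly become untestable.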

This is part 2 of a series on product experimentation and growth. In the next article I will cover “How to create a powerful experimentation system?”

Part 1- The ultimate guide to experimentation for product teams
