Estimating the Costs of Storing Immense Data: The Google Earth Example

Introduction

For aspiring product managers aiming to succeed in interviews at top firms such as the FAANG companies, it is crucial to demonstrate problem-solving prowess and structured thinking. This segment delves into a question candidates frequently encounter: estimating the cost of storing Google Earth photos. The hallmark of a strong PM candidate is the ability to break down complex problems using frameworks, which is exactly the approach we showcase below.

Detailed Guide on Framework Application

Choosing a Framework

For this estimation task, we’ll select the Fermi Estimation framework. It breaks seemingly intractable problems into manageable chunks that can be estimated individually and then combined into a total estimate.

Step-by-Step Guide with Examples

Now, let’s apply the Fermi Estimation framework to estimate the cost of storing Google Earth photos:

  1. Define the problem: Start by understanding that Google Earth imagery includes 2D and 3D images at various resolutions and dates. We need to estimate the storage these images would require and then the cost associated with that storage.
  2. Break the problem into parts: Consider the resolution of images, coverage area, file size for each image, types of images (2D/3D), and the redundancy for data reliability.
  3. Estimate each part: Using publicly available data, assume an average image resolution, calculate the area of the Earth’s land surface, estimate the average file size, and account for multiple images per location (different dates and resolutions) plus 3D data. Remember, in a business context it’s better to overestimate costs and data needs than to underestimate them.
  4. Aggregate estimates: Multiply the estimates from the previous step to find the total data storage requirement. For instance, if we assume the Earth’s land surface is 150 million square kilometers and each square kilometer is covered by ten high-resolution images averaging 100 MB each, we get 150 million km² × 10 images/km² × 100 MB/image = 150 billion MB, or roughly 150 petabytes of raw imagery (see the sketch after this list).
  5. Estimate the storage cost: Look at current rates for large-scale cloud storage. Google uses its own proprietary storage infrastructure, so its internal cost is likely lower, but researching market rates still gives you a useful comparison point.
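To make the aggregation concrete, here is a minimal Python sketch of the whole calculation. Every input figure below is an illustrative assumption carried over from the steps above, not real Google data:

```python
# Fermi estimate of Google Earth imagery storage and cost.
# All inputs are illustrative assumptions, not Google's actual figures.

LAND_AREA_KM2 = 150e6       # Earth's land surface, ~150 million km²
IMAGES_PER_KM2 = 10         # assumed: multiple dates/resolutions per location
MB_PER_IMAGE = 100          # assumed average high-resolution image size
REDUNDANCY = 2              # assumed 2x replication for backups/fault tolerance
USD_PER_TB_MONTH = 20       # rough market rate for cloud object storage

raw_mb = LAND_AREA_KM2 * IMAGES_PER_KM2 * MB_PER_IMAGE
raw_pb = raw_mb / 1e9                     # MB -> PB (decimal units)
stored_pb = raw_pb * REDUNDANCY
monthly_cost = stored_pb * 1000 * USD_PER_TB_MONTH  # PB -> TB, then price

print(f"Raw imagery:     {raw_pb:,.0f} PB")
print(f"With redundancy: {stored_pb:,.0f} PB")
print(f"Monthly cost:    ${monthly_cost:,.0f}")
```

Running this prints 150 PB of raw imagery, 300 PB after redundancy, and a monthly cost of $6,000,000. The value of writing it out this way in an interview is that each assumption is a named, swappable input rather than a number buried in mental math.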

We can use benchmarks and proxies when exact data isn’t available:

  • A high-resolution satellite image might be around 100 MB.
  • Assume 2x data redundancy for backups and fault tolerance.
  • Cloud storage costs can be researched from current providers like AWS or Google Cloud.

After aggregating our numbers, we arrive at roughly 150 petabytes of raw imagery, or about 300 petabytes once 2x redundancy is applied. At a market rate of $20 per TB per month, that works out to roughly $6 million per month (300,000 TB × $20). Adjust this figure for the unique circumstances Google might have, such as economies of scale and negotiated discounts.
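Because every input is an assumption, it also helps to show the interviewer how the answer moves when those assumptions change. Here is a minimal sensitivity sketch, reusing the same illustrative 150 PB raw estimate from above:

```python
# Vary the two most uncertain inputs to see how the estimate shifts.
# Figures are illustrative assumptions, not real Google data.

RAW_PB = 150  # raw imagery estimate from the steps above

for redundancy in (2, 3):
    for usd_per_tb_month in (10, 20, 40):
        stored_tb = RAW_PB * 1000 * redundancy  # PB -> TB
        cost = stored_tb * usd_per_tb_month
        print(f"{redundancy}x redundancy @ ${usd_per_tb_month}/TB-mo: "
              f"${cost / 1e6:,.1f}M per month")
```

Walking through even two or three of these rows out loud signals that you understand your answer is a range, not a point estimate.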

Communication Tips

Express humility about the assumptions and clarify that real-world data might shift these estimates. Speak methodically, showing your work step-by-step, to demonstrate transparency in your thought process.

Conclusion

Approaching the storage cost estimation for a massive, intricate dataset like Google Earth lets PM candidates exhibit their analytical and strategic thinking skills. By applying the Fermi Estimation framework, you can arrive at an informed estimate that demonstrates both cognitive flexibility and a logical approach to problem-solving. Practice this framework to build confidence and proficiency with similar product management interview questions.
