Infrastructure Opportunities in AI
Increasingly, customers are optimizing for speed by shifting workloads from serial CPUs to parallel-processing GPUs, creating a bottleneck in GPU availability. This has created three opportunities:
Optimization: companies are focused on the cost and utilization of their existing GPUs. How do I get more out of the assets I already have?
Procurement: finding additional AI assets is becoming incredibly difficult and competitive. Companies want GPUs but can't find them.
Portability: the shortage of GPUs opens the door to NVIDIA competitors. As more products come to market, portability from one GPU to another is paramount.
Infrastructure management continues to evolve, not replicating but certainly rhyming with previous cycles. Going back to mainframes, there has always been a need for cost and usage optimization. The ability to monitor, manage, and govern assets across the enterprise is mission critical. This remains true in the age of data, and specifically artificial intelligence, where workloads are massive and cost-prohibitive for most organizations.
Procurement is part real estate problem, part chip supply problem. Portability is most likely solved by emerging chip providers looking to capture market share from NVIDIA.
Related Concepts
For those interested in learning about older infrastructure management concepts, take a look at the following:
Infrastructure Management
IT Asset Management
Application Resource Management
Observability
Within the above segments, there are a number of legacy players whose products were architected 5-10 years ago and that are neither 1) injecting AI into their offerings nor 2) providing more comprehensive AI management solutions.
Infrastructure management is no longer just about data aggregation and insights; it should also include more complete automation of asset management.
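To illustrate the gap between insight and automation, here is a minimal sketch that polls GPU utilization via nvidia-smi and flags idle devices for reallocation. The 20% threshold and the reallocate_gpu() hook are hypothetical placeholders for illustration, not a reference to any particular vendor's API.

```python
# Minimal sketch: moving from GPU usage *insight* to *automated action*.
# Assumes nvidia-smi is installed; the idle threshold and reallocate_gpu()
# are illustrative placeholders, not a real product API.
import subprocess

IDLE_THRESHOLD_PCT = 20  # hypothetical cutoff for "underutilized"

def gpu_utilization() -> list[int]:
    """Return per-GPU utilization percentages reported by nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return [int(line) for line in out.stdout.splitlines() if line.strip()]

def reallocate_gpu(index: int) -> None:
    """Placeholder for an automated action, e.g. returning the GPU to a
    shared scheduler queue or resizing a cloud instance."""
    print(f"GPU {index}: underutilized, flagged for reallocation")

if __name__ == "__main__":
    for i, util in enumerate(gpu_utilization()):
        if util < IDLE_THRESHOLD_PCT:
            reallocate_gpu(i)                     # automation step
        else:
            print(f"GPU {i}: {util}% utilized")   # insight step
```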