Big AI Inference Has Become a Big Deal and a Bigger Business
Cerebras Takes Inference To a New Level
Cerebras Systems, the creator of wafer-scale, Frisbee-sized AI chips, has rolled out a plan to build six new data centers since entering the “high-value” token business. The company claims it will become the largest provider of such inference services globally by the end of this year. Some of the new data centers are already up and running, and the build-out will soon extend to France and Canada. The aggregate capacity of these systems, which will number in the thousands, will exceed 40 million Llama 70B tokens per second.
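To put the 40 million tokens per second figure in perspective, the short Python sketch below converts it into tokens per day and into a rough count of concurrent responses. The 40 million figure comes from the article and is a stated lower bound; the 1,000-token response length is a hypothetical assumption chosen only for illustration.

```python
# Figure from the article: aggregate capacity in Llama 70B tokens per second
# (the article says capacity "will exceed" this, so treat it as a lower bound).
AGGREGATE_TOKENS_PER_SECOND = 40_000_000

# Hypothetical assumption, for illustration only: tokens in a typical response.
ASSUMED_TOKENS_PER_RESPONSE = 1_000

SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_day(tokens_per_second: int) -> int:
    """Scale a per-second token rate to a full day of continuous operation."""
    return tokens_per_second * SECONDS_PER_DAY


def responses_per_second(tokens_per_second: int, tokens_per_response: int) -> float:
    """Estimate how many responses of a given length fit into one second of capacity."""
    return tokens_per_second / tokens_per_response


if __name__ == "__main__":
    daily = tokens_per_day(AGGREGATE_TOKENS_PER_SECOND)
    rate = responses_per_second(AGGREGATE_TOKENS_PER_SECOND, ASSUMED_TOKENS_PER_RESPONSE)
    print(f"{daily:,} tokens per day")
    print(f"{rate:,.0f} responses per second at {ASSUMED_TOKENS_PER_RESPONSE:,} tokens each")
```

Under these assumptions, the stated capacity works out to roughly 3.5 trillion tokens per day. Real throughput would depend on utilization, model mix, and request sizes, none of which the article specifies.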
High-Value Tokens
High-value tokens carry more contextual information and are typically more important for understanding the overall meaning of a text. They often represent key concepts, rare words, or specialized terminology. High-value tokens consume more computational resources and may cost more to process. This is because they typically require more attention from the model and contribute more significantly to the final output. Low-value tokens, which are more common and less informationally dense, usually require fewer processing resources. Clearly, Cerebras is targeting problems that are a good fit for its wafer-scale approach to AI.
The Inference Revolution is Just Beginning
Next week, we will hear more about “high-value” tokens from Nvidia at GTC, as the inference market overtakes training in global revenue. Markets such as autonomous vehicles, robots, and sovereign data centers all depend on fast inference, and Nvidia does not plan to let that market pass it by. The high-value concept is new, and platforms like Cerebras and Nvidia NVL72 are ideal for delivering it.
Achieving High-Performance Inference
Cerebras claims to be 30 times faster and 90% cheaper than alternatives, an advantage it attributes to its wafer-scale architecture. This level of performance in delivering high-value tokens is attracting new enterprise customers that also need elastic services. AlphaSense, for example, a leading market intelligence platform, has moved to Cerebras Inference, replacing a top-three closed-source AI model provider. Cerebras has also landed Perplexity, Mistral, Hugging Face, and other users of high-value inferencing, delivering inference performance 10 to 20 times faster than alternatives.
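The speed and cost multipliers quoted above can be made concrete with simple arithmetic. The sketch below applies the 30x and 90% figures from the article to a baseline response time and price. Both baseline values are hypothetical placeholders, not measurements of any real provider.

```python
# Multipliers quoted in the article.
SPEEDUP = 30            # "30 times faster"
PERCENT_CHEAPER = 90    # "90% cheaper"

# Hypothetical baseline values, for illustration only.
BASELINE_SECONDS_PER_RESPONSE = 30.0
BASELINE_DOLLARS_PER_MILLION_TOKENS = 10.0


def sped_up_seconds(baseline_seconds: float, speedup: float) -> float:
    """Response time after applying a speedup factor to a baseline time."""
    return baseline_seconds / speedup


def discounted_cost(baseline_cost: float, percent_cheaper: float) -> float:
    """Cost after applying a percentage reduction to a baseline cost."""
    return baseline_cost * (1 - percent_cheaper / 100)


if __name__ == "__main__":
    seconds = sped_up_seconds(BASELINE_SECONDS_PER_RESPONSE, SPEEDUP)
    dollars = discounted_cost(BASELINE_DOLLARS_PER_MILLION_TOKENS, PERCENT_CHEAPER)
    print(f"Response time: {BASELINE_SECONDS_PER_RESPONSE:.1f} s -> {seconds:.1f} s")
    print(f"Cost per million tokens: ${BASELINE_DOLLARS_PER_MILLION_TOKENS:.2f} -> ${dollars:.2f}")
```

With these placeholder baselines, a 30-second response would take about one second and a $10.00 bill would fall to about $1.00. The article does not state which baseline the multipliers are measured against, so actual savings depend on the system being compared.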
Conclusion
Cerebras’ recent announcement marks a significant milestone in the development of AI inference technology. With its wafer-scale architecture, Cerebras expects to become the largest provider of high-value inference services globally by the end of the year. As the inference market continues to grow, we can expect to see more innovations in this space. Cerebras’ focus on high-value tokens and its ability to deliver fast, efficient inference make it an attractive option for enterprises looking to apply AI to their business needs.
FAQs
What is Cerebras Systems? Cerebras Systems is the creator of wafer-scale, Frisbee-sized AI chips.
What is a high-value token? A high-value token carries more contextual information and is typically more important for understanding the overall meaning of a text.
How does Cerebras achieve high-performance inference? Cerebras claims to be 30 times faster and 90% cheaper than alternatives, which it attributes to its wafer-scale architecture.
What is the significance of the inference market? The inference market is expected to surpass the training market in global revenue, with applications in autonomous vehicles, robots, and sovereign data centers, among others.
Who are Cerebras’ clients? The customers named in this article are AlphaSense, Perplexity, Mistral, and Hugging Face, along with other users of high-value inferencing.
