Slowing Progress on OpenAI's "Orion" Model Challenges AI Scaling Laws

Nov 11, 2024

OpenAI's next flagship model, code-named "Orion," is reportedly improving more slowly over its predecessor than earlier generations did, challenging the long-held "scaling laws" of AI. The observation has prompted the industry to shift its focus toward post-training improvements.

As AI products like ChatGPT continue to gain users, development of the underlying technology, large language models (LLMs), appears to be decelerating. After completing 20% of its training, Orion was reportedly already performing at a level similar to the current GPT-4, yet the overall improvement is not expected to match the leap seen between previous generations.

Reports indicate that while Orion shows enhanced capabilities in language tasks, it may not outperform previous models in areas like coding. Running Orion in OpenAI's data centers may also cost more than running recent models.

Historically, LLMs have been pre-trained on publicly available text from websites and books, but that supply of high-quality data is nearing exhaustion, limiting further quality gains. To ease the data scarcity, Orion's training incorporates AI-generated data from models like GPT-4, which introduces its own challenge: the new model may end up resembling the older models in some respects.

Other AI companies face similar hurdles. Both Meta founder Mark Zuckerberg and Databricks co-founder Ion Stoica have noted that while advances continue in areas like coding and complex problem-solving, improvements in commonsense reasoning and general task capabilities have slowed.

Orion's slowdown challenges traditional AI scaling laws, which predict substantial performance improvements as data and computational resources increase. The industry is now exploring improvements made after initial training (post-training) in the hope of establishing new scaling paradigms.
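For context, here is a minimal sketch of what such scaling laws look like; this formulation comes from the published literature, not from the article. The widely cited Chinchilla fit of Hoffmann et al. (2022) models training loss as a power law in model size and data volume.

```latex
% Chinchilla-style scaling law (Hoffmann et al., 2022), shown here only as an illustration.
% N = number of model parameters, D = number of training tokens;
% E, A, B, \alpha, \beta are empirically fitted constants (none of these values come from this article).
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

Fits like this imply that adding parameters and training tokens should keep lowering loss at a predictable rate; the slowdown reported for Orion suggests those returns may be flattening sooner than expected.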

The dwindling supply of high-quality training data and rising computational costs have pushed OpenAI researchers to explore alternative ways to improve model performance. Efforts include building in stronger code-writing capabilities and developing software that automates tasks such as web browsing.

OpenAI has formed a team led by Nick Ryder, previously responsible for pre-training, to optimize training data and refine its scaling approach. The team is working on improving the model's problem-solving capabilities through reinforcement learning and human evaluation feedback.
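In practice, "reinforcement learning and human evaluation feedback" usually refers to reinforcement learning from human feedback (RLHF), in which a reward model is first trained on human preference comparisons. The snippet below is a minimal, generic sketch of that reward-model objective, not OpenAI's code; `reward_model` is a hypothetical callable that scores a prompt-response pair.

```python
import torch.nn.functional as F

# Minimal sketch of the pairwise (Bradley-Terry) loss commonly used to train a
# reward model from human preference data in RLHF-style post-training.
# `reward_model` is a hypothetical module mapping (prompt, response) to a scalar score;
# this illustrates the general technique, not OpenAI's implementation.
def preference_loss(reward_model, prompts, chosen, rejected):
    r_chosen = reward_model(prompts, chosen)      # scores for the human-preferred responses
    r_rejected = reward_model(prompts, rejected)  # scores for the dispreferred responses
    # Push the preferred response to be ranked higher than the rejected one.
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```

The language model is then fine-tuned, for example with a policy-gradient method such as PPO, to produce responses that score highly under this learned reward.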

The financial burden of high computational costs is significant: inference for the o1 model reportedly costs six times as much as for standard models. Even so, OpenAI and others continue to invest heavily in data centers, betting that more computing power can still extract gains from pre-trained models.

At the TEDAI conference, OpenAI researcher Noam Brown warned about the financial implications of developing more advanced models, questioning the feasibility of spending billions on training.

Going forward, AI companies like OpenAI will have to balance the trade-offs between training data and computational resources to improve model performance without incurring prohibitive costs.

Disclosures

I/We may personally own shares in some of the companies mentioned above. However, those positions are not material to either the company or to my/our portfolios.