The Market Street Journal, 1/19/25

Last night I bought a custom domain name for this blog (an investment, I promise), but it’ll take a few more days for the DNS records to fully propagate so this will still be published as a vanilla Substack post because I’m trying to honor my ritual and don’t want to miss a Sunday. Fingers crossed this blog is as web-optimized as possible by next Sunday!
Market Street: “The Path of Gold” after the 1906 earthquake, and more recently and colloquially “The Street They Keep Power-Washing When There’s a Big Conference in Town”. Today, Market is probably the single most culturally encapsulating street in SF. It starts at the Ferry Building and Embarcadero to the east, symbolically and literally the root of commerce in SF. It passes through a gorgeous stretch of historical architecture (City Hall, the symphony, the ballet, the opera house) in stark and sobering contrast to the activities that happen there. Finally, it cuts down through the Castro before trailing off into the quieter edges of the city.
Welcome to the first edition of the Market Street Journal, established in SF to report on the Modern AI Gold Rush ;)
Energy shifts in SF and US policy
This week on Market we saw even more Patagonia jackets than usual, courtesy of the JPM conference in town. Nightly hotel rates in the city broke records by a factor of ten. (Some last-minute travelers were forced to book hotels in Oakland, which might have sounded cool until they realized they had to shove themselves onto overcrowded transit every morning to make it into SF in time for meetings.)
Daniel Lurie was recently inaugurated as SF’s new mayor and showed us what Chinatown could actually look and feel like by putting on a night market rivaling the foot traffic of Times Square and booking Bay Area native ZHU.
Perplexity perplexed many by bidding to merge with TikTok and TikTok… did TikTok. The TikTok ban drove interesting user behaviors across social media apps, including an influx of users to RedNote and wider recognition that ByteDance’s influence extends far beyond TikTok (CapCut).
President Biden signed an executive order, dubbed the “AI energy bill”, to “accelerate large-scale AI infrastructure development at government sites”, in tandem with the Biden administration preparing to hand off the $52 billion CHIPS and Science Act to the incoming administration. Unsurprisingly, TSMC beat Q4 earnings expectations, citing “surging demand for advanced chips used in AI applications”, which also boosted NVIDIA and Tesla. AI chips powering self-driving cars at scale is still a longer-horizon bet, but futures traders should take note. Anecdotally, Waymo is picking up traction as a legitimate price/wait-time competitor to Uber and Lyft in SF (beyond just being a tourist attraction).
Major moves in multimodal
Shanghai-based MiniMax (valued at $2.5B+) released three new models: MiniMax-Text-01, MiniMax-VL-01, and T2A-01-HD, for text, vision-language, and audio respectively. MiniMax claims the 456B-parameter MiniMax-Text-01 beats Gemini 2.0 Flash on MMLU and SimpleQA, that MiniMax-VL-01 can compete with Claude 3.5 Sonnet on multimodal benchmarks such as ChartQA, and TechCrunch assesses T2A-01-HD to be comparable with Meta’s audio models.
London-based AI video generator Synthesia (valued at $2.1B, backed by NVIDIA) raised $180M in a round led by NEA, doubling its valuation since 2023. Runway is another player in the AI video space I’ve been following for a while. Generative multimodal technology, especially audio/visual, is in a position similar to self-driving automotive technology: reasonably functional iterations are in production and scaled to end customers, but a long path remains to parity with human-made, human-driven experience. Arguably self-driving is a step ahead of AI audiovisual, but that may soon change given the lower cost and energy barrier to entry for multimodal model development, as I mentioned in my last post.
The meme economy and crypto
$TRUMP (launched Friday on Solana) hit a $13B market cap; Solana still trails Ethereum by a factor of at least 5x in TVL, but as of today beats Ethereum by almost 10x in 24-hour volume. GameStop, Taylor Swift, and now, unironically, Solana meme coin Fartcoin: money is as unserious now as it’s ever been, which should be the loudest signal for us to get real and start talking about stablecoin.
Stablecoin’s natural use case is as a fiat-, precious-metal-, or other asset-backed cryptocurrency that theoretically solves the volatility problem, the same problem that makes Bitcoin and Ethereum more amenable as trading instruments than as reliably valued currency to exchange for goods and services. Beyond that, Chainalysis now reports stablecoin has officially surpassed Bitcoin as the currency dominating the $50B+ cybercrime industry (malware, scams, stolen funds, darknet markets, and ~$10B in FTX creditor claims in 2022), taking the majority (63%) of all illicit transaction volume at 77% YoY growth.
Before we get too hot on stablecoin, let’s revisit its monetization mechanisms. When Stripe bought stablecoin infra startup Bridge for $1.1B last year, MIT Cryptoeconomics Lab founder Christian Catalini highlighted in Forbes:
Relying solely on reserve interest isn’t a sustainable way to monetize a stablecoin… Stablecoin issuers like Circle and Tether [N.B. the two largest stablecoins by market cap and both backed by USD, although Tether has faced skepticism] seem to overlook that today’s high-interest environment is an anomaly, and a sustainable business can’t be built on a foundation that’s likely to crumble when market conditions shift.
…what options do stablecoin issuers have? Unless they’re relying on temporary regulatory loopholes—which are unlikely to hold long-term (more on that in the next section)—they’ll need to start competing with their own customers.
…Stripe doesn’t face this dilemma. As one of the world’s most successful payments companies, they’ve mastered the art of deploying and monetizing a streamlined software layer on top of global money movement—a model that scales efficiently through network effects without being slowed down by the need for country-specific banking licenses. Stablecoins accelerate this approach by acting as a bridge between Stripe and domestic banking and payment rails. What was once a network constrained by legacy institutions—including card companies—can now overcome its last-mile problem, delivering significantly more value to merchants and consumers.
Editor’s desk
You didn’t really think you’d get away without a paper dump from me, did you?
Last week I put A Hitchhiker’s Guide to Scaling Law Estimation on my reading list. This 2024 paper from Leshem Choshen, Yang Zhang, and Jacob Andreas at MIT and IBM compiled and released “a large-scale dataset containing losses and downstream evaluations for 485 previously published pretrained models” including Llama and GPT, used these to “estimate more than 1000 scaling laws”, and “derive[d] a set of best practices for estimating scaling laws in new model families”. Recall that model loss measures the distance between a model’s predictions and the actual labels, so we generally want to minimize loss when training models; the core tradeoffs with computationally intensive training methods such as deep learning are cost and compute. This paper is important because it established the largest public dataset describing scaling behavior across model families, plus a generic mathematical framework defining a set of scaling laws that estimate the performance of a new model f given a model family F. Publishing the “secret sauce” here and opening it up to public contribution is an admirable and influential way to empower researchers and practitioners alike to move faster and cheaper in the otherwise exponentially more complex, compute- and cost-intensive space of deep model training.
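To make the idea of a scaling law concrete, here’s a minimal sketch (not the paper’s exact method, and the numbers are synthetic): a common functional form is L(N) = a · N^(−α), loss as a power law in parameter count N. Dropping the irreducible-loss term makes it linear in log-log space, so a least-squares fit recovers the coefficients from a handful of (size, loss) points:

```python
import numpy as np

def fit_power_law(params, losses):
    """Fit log(loss) = log(a) - alpha * log(N); return (a, alpha)."""
    slope, intercept = np.polyfit(np.log(params), np.log(losses), 1)
    return np.exp(intercept), -slope

# Synthetic "model family": losses generated from a known law (a=200, alpha=0.3)
N = np.array([1e8, 5e8, 1e9, 5e9, 1e10])   # parameter counts
L = 200.0 * N ** -0.3                       # observed training losses

a, alpha = fit_power_law(N, L)
print(round(a, 1), round(alpha, 3))  # recovers a≈200.0, alpha≈0.3

# Extrapolate: predicted loss for a hypothetical 100B-parameter model
print(round(a * 1e11 ** -alpha, 3))
```

The paper’s contribution is far richer than this toy (it covers intermediate checkpoints, multiple seeds, and cross-family estimation), but the basic machinery of fitting a small model family and extrapolating to a bigger one looks like the above.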
Chirag Shah and Ryen White at UW and Microsoft submitted Agents Are Not Enough (Dec 2024) to ACM Proceedings, which felt like a breath of fresh air to read, especially amid the current industry wave of every remotely e-commerce-adjacent application releasing an AI customer service chatbot, and marketing emails put together so haphazardly that they forget to replace “[Your Name]” with my actual name:
While generative AI is appealing, this technology alone is insufficient to make new generations of agents more successful. To make the current wave of agents effective and sustainable, we envision an ecosystem that includes not only agents but also Sims, which represent user preferences and behaviors, as well as Assistants, which directly interact with the user and coordinate the execution of user tasks with the help of the agents.
Shah and White describe a number of agent development waves from the 1950s to the 2000s, including multi-agent systems (still an area of significant development, including a recent benchmark for multimodal multi-agent systems in Minecraft, and Altera AI also using Minecraft as a simulation space for large-scale agentic behavioral studies) and cognitive architectures like Soar from CMU. The primary concerns and obstacles they raise to continued widespread societal adoption of AI agents are scalability and robustness (predictable) and ethics and safety (predictable, but still in need of way more research attention, as I’ve been saying for months). The notable difference this paper makes is in actually proposing an expanded paradigm for developing agentic AI: a world of Agents, Sims, and Assistants. No doubt, large-scale social media and consumer applications that run on user preferences and monetize targeted ads formulated “Sims” first at such scale (e.g. user profiles, preferences, and settings in recommender systems), but what they haven’t yet popularized in the literature is the idea of “Assistants” that operate in the space between “Sims” and “Agents”.
An Assistant is a program that directly interacts with the user, has a deep understanding of that user, and has an ability to call Sims and Agents as needed to reactively or proactively accomplish tasks and sub-tasks for the user. In this regard, an Assistant is a private version of an agent that can have access to a user’s personal information and could be fine-tuned to that user, allowing it to act on the user’s behalf.
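To sketch how the three pieces fit together (all class and method names here are my own illustrative assumptions, not an API from the paper): an Assistant sits privately with the user, consults Sims for a model of the user’s preferences, and dispatches Agents to do work in the world, then ranks the results through the user’s lens.

```python
from dataclasses import dataclass, field

@dataclass
class Sim:
    """One facet of a user's preferences and behaviors."""
    facet: str                                   # e.g. "reading-list"
    preferences: dict = field(default_factory=dict)

    def score(self, item: str) -> float:
        # Toy relevance: count preference keywords appearing in the item.
        return sum(kw in item.lower() for kw in self.preferences)

class Agent:
    """Task executor in the outside world (search, booking, posting...)."""
    def run(self, task: str) -> list:
        # Stand-in for a real tool call; returns candidate results.
        return [f"agents paper on {task}", f"ad for {task}", f"scaling survey of {task}"]

class Assistant:
    """Private layer that knows the user, consults Sims, dispatches Agents."""
    def __init__(self, sims, agent):
        self.sims, self.agent = sims, agent

    def accomplish(self, task: str) -> list:
        candidates = self.agent.run(task)
        # Rank agent output by how well it matches the user's Sims.
        return sorted(candidates,
                      key=lambda c: -sum(s.score(c) for s in self.sims))

sims = [Sim("reading-list", {"scaling": 1, "agents": 1})]
assistant = Assistant(sims, Agent())
print(assistant.accomplish("llms"))
```

The key design point the paper argues for is visible even in this toy: the Agent is generic and shareable, while the Sims and the Assistant are private to the user, which is where the trust and personalization live.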
Just last night I was imagining a world where I could deploy AI versions of my online behaviors (modeled after my existing behavior on Twitter and Substack, for example). They wouldn’t write for me, because I like writing so much and am not looking to automate that away, but they could interact with Tweets in a way true to my behavior, go out and find Substack posts and papers I’d find interesting, and tailor my social media algorithms for me at scale, helping me find inspiration in potentially 1/10th or 1/100th of my current time spent. Those would effectively be my Assistants in this Agent/Sim/Assistant model. Watch this space :)