Reports that the Mac Pro and Mac Studio shared the same M2 Ultra chip, with the Mac Pro's only differentiator being seven PCIe slots most buyers never used. The aftermarket card ecosystem for Apple Silicon never materialized, making the tower form factor unnecessary.
Argues that Apple's M-series chips demolished the traditional workstation thesis by putting CPU, GPU, Neural Engine, and massive unified memory on a single package. The M4 Ultra's expected 256GB unified memory eliminates PCIe bus bottlenecks and discrete GPU memory copying, making the expandable chassis obsolete.
Highlights that Apple's architecture is particularly well-suited for inference workloads, where large models need to fit entirely in memory accessible by both CPU and GPU at full bandwidth — a use case where the Mac Studio's unified memory pool outperforms traditional PCIe-based GPU setups.
Acknowledges that the 2019 Mac Pro served a real audience — video editors with RED workflows, audio engineers running massive Pro Tools sessions, and researchers who needed PCIe expandability. These users relied on discrete GPUs, FPGAs, and 10GbE NICs that the fixed Mac Studio form factor cannot accommodate.
Apple has confirmed to 9to5Mac that the Mac Pro is being discontinued. The tower workstation — which traced its lineage back through the Power Mac G5, the 2013 "trash can," and the 2019 cheese-grater revival — is done. No replacement has been announced. The current Mac Studio and Mac Pro share the same M2 Ultra chip, and Apple apparently decided that maintaining two product lines for the same silicon was one product line too many.
The Mac Pro's final years were strange. The 2019 model was a genuine workstation: PCIe slots, MPX GPU modules, up to 1.5TB of RAM. It was expensive, but it served a real audience — video editors with RED workflows, audio engineers running massive Pro Tools sessions, researchers who needed expandability. When Apple moved to Apple Silicon, the Mac Pro lost its reason to exist. The M2 Ultra Mac Pro had exactly one advantage over the M2 Ultra Mac Studio: seven PCIe slots that most buyers never used, because the M-series chips didn't need discrete GPUs and the aftermarket card ecosystem for Apple Silicon never materialized.
The discontinuation is less interesting than what it reveals about where Apple thinks professional computing is heading. The traditional workstation thesis — give power users a box they can fill with specialized cards — assumed that the CPU was necessary but insufficient. You needed a GPU for rendering, maybe an FPGA for signal processing, a 10GbE NIC for storage networking. The Mac Pro was a chassis for that thesis.
Apple's M-series chips demolished that model by putting CPU, GPU, Neural Engine, and a massive unified memory pool on a single package. The M4 Ultra, expected later this year, will reportedly offer up to 256GB of unified memory accessible by both CPU and GPU cores at full bandwidth. No PCIe bus bottleneck. No discrete GPU memory copying. For inference workloads specifically, this architecture is nearly ideal.
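A back-of-envelope sketch makes the bandwidth point concrete. The figures below are illustrative assumptions rather than benchmarks: roughly 32 GB/s for a PCIe 4.0 x16 link, and Apple's quoted ~800 GB/s memory bandwidth for the M2 Ultra:

```python
# Back-of-envelope: decoding one token requires streaming every weight
# through the compute units once, so memory bandwidth caps tokens/sec.
# Illustrative figures, not benchmarks: PCIe 4.0 x16 ~32 GB/s,
# M2 Ultra unified memory ~800 GB/s (Apple's quoted number).

WEIGHTS_GB = 40.0          # ~70B parameters at 4-bit quantization
PCIE4_X16_GBPS = 32.0
UNIFIED_GBPS = 800.0

# One-time cost to stage weights into VRAM over PCIe. Unified memory
# skips this copy entirely -- CPU and GPU address the same pages.
staging_copy_s = WEIGHTS_GB / PCIE4_X16_GBPS

# Upper bound on decode speed when weights stream from unified memory.
max_tokens_per_s = UNIFIED_GBPS / WEIGHTS_GB

print(f"PCIe staging copy: {staging_copy_s:.2f} s")
print(f"Bandwidth-bound decode ceiling: {max_tokens_per_s:.0f} tokens/s")
```

The second number is why bandwidth, not raw compute, is usually the binding constraint for local inference.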
As HN commenter chatmasta noted: "Apple really stumbled into making the perfect hardware for home inference machines. Does any hardware company come close to Apple in terms of unified memory and single machines for high throughput inference workloads?" The answer, right now, is no. NVIDIA dominates training and data-center inference, but for a single developer running a 70B parameter model locally, nothing matches what a maxed-out Mac Studio delivers. A 192GB unified memory Mac Studio can hold models that would require multiple consumer GPUs with complex VRAM management on any other platform.
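The sizing arithmetic behind that claim is easy to sketch. A minimal estimator follows; the 16 GB reserve and the 24 GB consumer-GPU VRAM figure are illustrative assumptions, not measurements, and real memory use adds KV cache and runtime overhead that varies between llama.cpp and MLX:

```python
# Rough sizing: weight footprint at common quantization levels, plus how
# many 24 GB consumer cards the same weights would have to span.
# Order-of-magnitude only -- real usage adds KV cache and overhead.
import math

BYTES_PER_PARAM = {"fp16": 2.0, "q8_0": 1.0, "q4_0": 0.5}

def weights_gb(n_params_b: float, quant: str) -> float:
    """Weight footprint in GB for n_params_b billion parameters."""
    return n_params_b * BYTES_PER_PARAM[quant]

def fits(n_params_b: float, quant: str, ram_gb: float,
         reserve_gb: float = 16.0) -> bool:
    """True if weights fit after reserving room for OS, KV cache, runtime."""
    return weights_gb(n_params_b, quant) <= ram_gb - reserve_gb

def consumer_gpus_needed(n_params_b: float, quant: str,
                         vram_gb: float = 24.0) -> int:
    """How many vram_gb cards the weights would have to be split across."""
    return math.ceil(weights_gb(n_params_b, quant) / vram_gb)

for quant in ("fp16", "q8_0", "q4_0"):
    print(f"70B @ {quant}: {weights_gb(70, quant):.0f} GB of weights, "
          f"fits in 192 GB unified: {fits(70, quant, 192)}, "
          f"24 GB GPUs needed: {consumer_gpus_needed(70, quant)}")
```

Even at fp16, a 70B model's 140 GB of weights sits comfortably in a 192 GB pool, while the same weights would have to be sharded across six 24 GB cards on a consumer multi-GPU rig.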
The community reaction splits into two camps. One group, represented by commenters like readitalready, sees a massive missed opportunity: "Apple had every ability to make something competitive with Nvidia for AI training as well as inference, by selling high-end multi-GPU Mac Pro workstations as well as servers, but for some reason chose not to." The argument is that Apple could have bonded four or eight M-series dies together, built rack-mount configurations, and gone after the training market. They had the silicon design expertise, the manufacturing relationships with TSMC, and the software stack (Core ML, Metal). They chose not to.
The other camp argues Apple made the right call. Training is a hyperscaler game — Meta, Google, Microsoft, and a handful of startups are buying NVIDIA hardware by the tens of thousands. Apple entering that market would mean competing on software ecosystem (CUDA's moat is deep), selling to enterprises (not Apple's strength), and building server-class cooling and networking (not Apple's interest). By focusing the M-series on the single-machine, unified-memory sweet spot, Apple carved out a position no one else occupies: the best hardware for running large models locally on a device that fits on a desk.
The M5 generation adds another wrinkle. As noted in community discussion, the M5 Pro and Max chips have moved to a chiplet architecture, with CPU cores on one chiplet and GPU cores on another. This is the same direction AMD took years ago, and it suggests Apple is preparing for more flexible configurations. A hypothetical M5 Ultra could bond chiplets in ways that prioritize GPU core count for compute workloads, or memory bandwidth for inference. The architecture is getting more modular even as the product line gets simpler.
If you're a developer who was waiting for a next-gen Mac Pro to run local AI workloads, stop waiting. The Mac Studio is the product. For most inference use cases — running Llama, Mistral, or similar open-weight models via llama.cpp or MLX — the Mac Studio with an M4 Ultra and 192GB unified memory will outperform any consumer-grade multi-GPU rig on a per-dollar basis, with none of the driver headaches.
The practical ceiling is inference, not training. If your workflow involves fine-tuning models above 13B parameters or pre-training anything meaningful, you still need NVIDIA GPUs or cloud instances. Apple has shown zero interest in the training market, and MLX, while improving rapidly, doesn't match PyTorch + CUDA for training workloads. Plan your hardware budget with this split in mind: Mac Studio for local inference and development, cloud GPU instances for training runs.
For teams that relied on Mac Pro expandability — PCIe capture cards for video production, Avid HDX cards for audio — the transition is less clean. Thunderbolt external chassis exist but add latency and complexity. Apple is betting that these niche use cases either move to Thunderbolt-native hardware or shrink to irrelevance. If your studio depends on PCIe cards that have no Thunderbolt equivalent, start evaluating alternatives now rather than when your current Mac Pro dies.
The Mac Pro's death marks the end of Apple's belief that professional users need a big box they can open. The M5 Ultra Mac Studio, likely arriving at WWDC in June, will be the most powerful single-unit inference machine available to individual developers. Whether Apple eventually revisits the multi-die server market — perhaps for enterprise on-device AI deployments — remains an open question. But for now, the message is clear: Apple thinks the future of professional computing fits in a small aluminum square, and the 192GB of unified memory inside it is the spec that matters more than any expansion slot ever did.