• Welcome to the new Internet Infidels Discussion Board, formerly Talk Freethought.

Apple's upcoming CPU-architecture change: ARM-based "Apple Silicon"

MacBook Air M1 review: Windows laptops are so screwed
But are these new Macs really better than Intel-powered ones?

1,000 percent yes!

All of the hype about the 8-core M1 chip in the MacBook Air — up to 3.5x faster CPU and up to 5x faster GPU performance, and almost double the battery life compared to the previous-gen Intel MacBook Air — is real. I’ve been using the entry-level $999 MacBook Air with 8GB of RAM and 256GB of storage nonstop as my only computer for a week and it still doesn’t feel possible that a laptop this thin and this light is capable of all this power and battery life. It makes my 2019 13-inch MacBook Pro with Intel Core i5 and 16GB of RAM look like smoldering trash now.

What Apple has achieved with the M1 is nothing short of groundbreaking. It pains me there’s still no touchscreen, there are only two USB-C ports, there’s no SD card slot, and I know I'm bound to run into some apps that don’t emulate well (or at all) with Rosetta 2 (macOS Big Sur’s x86 Intel app translator), but these are all trivial issues.

The M1 MacBook Air (and M1 MacBook Pro) are now the best laptops regardless of operating system. They’re the new gold standard by which all laptops will be judged, and this is just the start. In a few years, we’ll look back and wonder how we ever tolerated laptops with anything less than this kind of performance.
Let's see how the PeeCee world reacts to it.

Microsoft is adding x64 emulation to Windows on ARM
Yesterday, Microsoft officially announced that it’s working on an x64 emulation for Windows on ARM, which will pave the way for up-to-date versions of applications like the Adobe Creative Suite to finally work on the platform.

“We will also expand support for running x64 apps, with x64 emulation starting to roll out to the Windows Insider Program in November,” Microsoft Chief Product Officer Panos Panay said in the announcement.
Seems like M$ will have to do a lot of catching up.

In Linux-land,
PINEBOOK Pro | PINE64
A Powerful, Metal and Open Source ARM 64-Bit Laptop for Work, School or Fun

The Pinebook Pro is meant to deliver solid day-to-day Linux or *BSD experience and to be a compelling alternative to mid-ranged Chromebooks that people convert into Linux laptops. In contrast to most mid-ranged Chromebooks however, the Pinebook Pro comes with an IPS 1080p 14″ LCD panel, a premium magnesium alloy shell, 64/128GB of eMMC storage* (more on this later – see asterisk below), a 10,000 mAh capacity battery and the modularity / hackability that only an open source project can deliver – such as the unpopulated PCIe m.2 NVMe slot (an optional feature which requires an optional adapter). The USB-C port on the Pinebook Pro, apart from being able to transmit data and charge the unit, is also capable of digital video output up-to 4K at 60hz.
 
 ARM architecture and  AArch64 - 64-bit extension of the ARM architecture

ARM = Acorn RISC Machine, then Advanced RISC Machine

It goes back to  Acorn Archimedes (late 1980's) made by  Acorn Computers

ARM chips have been used in oodles of game consoles, PDA's, cellphones, tablets, and other such devices.

The ARM architecture is classified as RISC because of several features.  Reduced instruction set computer

It has a load/store architecture: only dedicated load and store instructions access main memory, and all other instructions work on registers.

It has limited or no support for misaligned memory accesses. Aligned: 16-bit on 2-byte boundaries, 32-bit on 4-byte boundaries, 64-bit on 8-byte boundaries.
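To make the alignment rule concrete, here is a minimal Python sketch. The helper names `is_aligned` and `align_up` are made up for illustration and are not part of any ARM toolchain; the only thing taken from the rule above is that an N-byte access should sit on an N-byte boundary.

```python
def is_aligned(address: int, size_bytes: int) -> bool:
    """Return True if `address` sits on a `size_bytes` boundary (power of two)."""
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("size_bytes must be a positive power of two")
    # For a power-of-two size, an aligned address has all of its low bits zero.
    return (address & (size_bytes - 1)) == 0


def align_up(address: int, size_bytes: int) -> int:
    """Round `address` up to the next `size_bytes` boundary (power of two)."""
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("size_bytes must be a positive power of two")
    return (address + size_bytes - 1) & ~(size_bytes - 1)


print(is_aligned(0x1000, 4))      # 32-bit access on a 4-byte boundary
print(is_aligned(0x1002, 4))      # 32-bit access that straddles a boundary
print(hex(align_up(0x1001, 8)))   # next 8-byte boundary for a 64-bit access
```

A compiler or allocator does the equivalent of `align_up` when it pads structures so that every field lands on its natural boundary.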

It has fixed-sized 32-bit-long instructions, though it has a Thumb mode where it can use 16-bit-long instructions to save memory.

Its "register file" is 16 32-bit registers (32-bit versions) or 31 64-bit registers (64-bit versions)

Some versions of ARM chips don't have hardcoded divide instructions, though all versions have hardcoded add, subtract, and multiply ones.

So it looks like we are seeing RISC on the desktop again.
 
AWS engineer puts Windows 10 on Arm on Apple Mac M1 – and it thrashes Surface Pro X | ZDNet - "A virtualized Windows 10 on Arm runs faster on Apple's M1 hardware than on Microsoft's own Arm-based Surface Pro X."
But Microsoft's reluctance to create a license for Windows 10 on Arm for end users hasn't stopped creative engineers from putting together a working example of what things could be like if it did.

AWS principal engineer Alexander Graf did just that, using the open-source QEMU virtualization software for Windows on Arm. QEMU emulates access to hardware such as the CPU and GPU. Graf's work was spotted by The 8-bit, via 9to5Mac.
ARM Windows M1 Mac virtualization demonstrated - 9to5Mac

Developer successfully virtualizes ARM Windows on Apple Silicon | The 8-Bit

Alexander Graf on Twitter: "Who said Windows wouldn't run well on #AppleSilicon? It's pretty snappy here 😁. #QEMU patches for reference: (links)" / Twitter
with a screenshot and QEMU patches - Patchwork
 
Apple isn't stopping with the M1 chip.

Apple (AAPL) Preps Next Mac Chips With Aim to Outclass Highest-End PCs - Bloomberg
Apple’s Mac chips, like those in its iPhone, iPad and Apple Watch, use technology licensed from Arm Ltd., the chip design firm whose blueprints underpin much of the mobile industry and which Nvidia Corp. is in the process of acquiring. Apple designs the chips and outsources their production to Taiwan Semiconductor Manufacturing Co., which has taken the lead from Intel in chip manufacturing.

The current M1 chip inherits a mobile-centric design built around four high-performance processing cores to accelerate tasks like video editing and four power-saving cores that can handle less intensive jobs like web browsing. For its next generation chip targeting MacBook Pro and iMac models, Apple is working on designs with as many as 16 power cores and four efficiency cores, the people said.
20 cores on a chip.

Apple could scale back and release chips with 8 or 12 high-performance cores instead of 16.

For higher-end desktop computers and Mac Pro models, Apple is working on chips with as many as 32 high-performance cores.

Currently with Intel-x86 chips, Apple's highest-end laptops can have as many as 8 cores, a high-end iMac Pro 18 cores, and a high-end Mac Pro 28 cores.

AMD, Intel's main competitor with the x86 architecture, offers desktop CPU chips with as many as 16 cores, and high-end gaming-PC ones 64 cores.

Turning to graphics processors, the M1 includes an 8-core GPU, and for high-range laptops and mid-range desktops, Apple is testing 16-core and 32-core GPU's, and eventually 64-core and 128-core GPU's.
 
Apple is also planning on a "half-sized" Mac Pro.

That would fill in a gap between Apple's Mac Mini and Mac Pro. That is a gap that has existed in Apple's product line for a long time.

I think that a problem is Steve Jobs's design ideology, for lack of a better term. He liked all-in-ones. The Lisas and earliest Macs were all AIO's, but when SJ was forced out of Apple in 1985, Apple started offering what the PeeCee world had long had: boxes.

But in NeXT, SJ indulged in his love of AIO's, coming out with his NeXT Cube.

He returned to Apple in 1997, and he got into AIO's again with the iMac line. Apple introduced some high-end computers with little expandability, like a cylindrical Mac Pro, but they didn't do well, and Apple has made its most recent Mac Pro more expandable.

While it's OK to have low-end computers have little or no expandability, high-end users like expandability, and it's good that Apple's getting away from SJ's design ideology there.
 
There is an advantage to non-expandable RAM. It consumes less power and is faster, sometimes ridiculously faster (HBM).
I think eventually these things will outweigh the inability to expand.
 
 High Bandwidth Memory
HBM achieves higher bandwidth while using less power in a substantially smaller form factor than DDR4 or GDDR5.[7] This is achieved by stacking up to eight DRAM dies (thus being a Three-dimensional integrated circuit), including an optional base die (often a silicon interposer[8][9]) with a memory controller, which are interconnected by through-silicon vias (TSVs) and microbumps. The HBM technology is similar in principle but incompatible with the Hybrid Memory Cube interface developed by Micron Technology.[10]
 
Apple’s M1 is a fast CPU—but M1 Macs feel even faster due to QoS | Ars Technica
QoS = Quality of Service
"Howard Oakley did an excellent deep dive on M1 scheduling and performance."
In How M1 Macs feel faster than Intel models: it’s about QoS – The Eclectic Light Company

Back to Ars Technica.
There's a very common tendency to equate "performance" with throughput—roughly speaking, tasks accomplished per unit of time. Although throughput is generally the easiest metric to measure, it doesn't correspond very well to human perception. What humans generally notice isn't throughput, it's latency—not the number of times a task can be accomplished, but the time it takes to complete an individual task.

...
When Oakley noticed how frequently Mac users praised M1 Macs for feeling incredibly fast—despite performance measurements that don't always back those feelings up—he took a closer look at macOS native task scheduling.

MacOS offers four directly specified levels of task prioritization—from low to high, they are background, utility, userInitiated, and userInteractive. There's also a fifth level (the default, when no QoS level is manually specified) which allows macOS to decide for itself how important a task is.
Apple's M1 chips have 4 performance CPU cores and 4 efficiency CPU cores. Background tasks are assigned to the efficiency ones, and interactive ones to the performance cores. That makes interaction very snappy, since background tasks don't interfere with interactive ones.
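Here is a toy Python model of that idea, not Apple's actual scheduler. The four level names come from the article. Where `utility` and `userInitiated` end up is an assumption of this sketch, and the function names `pick_cluster` and `assign` are invented for illustration.

```python
from enum import Enum


class QoS(Enum):
    """The four explicit QoS levels named in the article, lowest first."""
    BACKGROUND = "background"
    UTILITY = "utility"
    USER_INITIATED = "userInitiated"
    USER_INTERACTIVE = "userInteractive"


def pick_cluster(qos: QoS) -> str:
    """Toy policy: background work goes to efficiency cores, the rest to performance cores.

    Sending utility and userInitiated to performance cores is an assumption here.
    """
    return "efficiency" if qos is QoS.BACKGROUND else "performance"


def assign(tasks):
    """Group (name, qos) pairs by the core cluster the toy policy picks."""
    clusters = {"efficiency": [], "performance": []}
    for name, qos in tasks:
        clusters[pick_cluster(qos)].append(name)
    return clusters


tasks = [
    ("backup", QoS.BACKGROUND),
    ("scrolling", QoS.USER_INTERACTIVE),
    ("indexing", QoS.BACKGROUND),
    ("export", QoS.USER_INITIATED),
]
print(assign(tasks))
```

The point of the split is visible even in this toy: the backup and indexing jobs never queue up in front of the scrolling task, so interactive latency stays low even if total throughput is no higher.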
 
Back in the day, the battle for market share between Intel and Motorola was legendary. One of the things that kept Intel entrenched in PCs and other applications was maintaining backward compatibility: the x86 mode.

You can run old x86 software with little effort.

ARM has been around for a while, and I believe its roots are in Motorola.

ARM does not manufacture chips; it licenses designs.

Microcomputers migrated to ARM. It means a common instruction set.

I'd think this could throw a wrench into things for app developers. I wonder if this is another aggressive domination move by Apple.
 
 List of iOS and iPadOS devices - Apple has been using ARM chips since the first iPhone, back in 2007, starting with a Samsung chip and continuing with its A series.

I've also found  Mac transition to Apple silicon and  Comparison of current Macintosh models

Which chips?
  • 2020 Nov 17: M1 (MacBook Air, MacBook Pro 13", Mac Mini)
  • 2021 May 21: M1 (iPad Pro, iMac)
  • 2021 Oct 18: M1 Pro, M1 Max (MacBook Pro 14", 16")
  • 2022 Mar 8: M1 (iPad Air), M1 Max, M1 Ultra (Mac Studio)
The Mac Pro and some Mac Minis are the only parts of Apple's lineup that still use an Intel-x86 CPU.

The M1 Ultra chip is two M1 Max chips with a thin interconnect strip between them, thus making the two chips act like a single super chip.

Another feature of the M1 is that the RAM is in the same package as the CPU chip. That likely improves performance and reduces cost, though at the cost of less flexibility.


 Apple silicon covers not only the M1 series but several other ARM-based chips that Apple has used over the years, like most of those in the various models of iPhone and iPad.

 Apple M1 and  Apple M1 Pro and M1 Max which are  System on a chip - CPU, memory, GPU, etc.
 
 Apple silicon -- one kind of CPU core, or else two kinds (fast, slow) with the same cache sizes. L1 is level 1, L2 level 2, L3 level 3, SLC system level cache (for the entire chip). L1 is split up into L1i for instructions and L1d for data. In the later ones at least (A14, A15, M1) each type of core shares the L2 cache.

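| Chip | # Cores | L1i KB | L1d KB | L2 MB | L3/SLC MB |
|---|---|---|---|---|---|
| (A1,2) | 1 | 16 | 16 | | |
| (A3) | 1 | 32 | 32 | 0.25 | |
| A4 | 1 | 32 | 32 | 0.5 | |
| A5,5X,6,6X | 2 | 32 | 32 | 1 | |
| A7,8 | 2 | 64 | 64 | 1 | 4 |
| A8X | 3 | 64 | 64 | 2 | 4 |
| A9 | 2 | 64 | 64 | 3 | 4 |
| A9X | 2 | 64 | 64 | 3 | |
| A10 | 2 + 2 | 64 | 64 | 3 | 4 |
| A10X | 3 + 3 | 64 | 64 | 8 | 4 |
| A11 | 2 + 4 | 64 | 64 | 8 | 4 |
| A12 | 2 + 4 | 128 | 128 | 8 | 8 |
| A12X,12Z | 4 + 4 | 128 | 128 | 8 | 8 |
| A13 | 2 + 4 | 128 | 128 | 8 | 16 |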

With separate cache amounts for fast and slow cores
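| Chip | # Fast | Fast L1i KB | Fast L1d KB | Fast L2 MB | # Slow | Slow L1i KB | Slow L1d KB | Slow L2 MB | SLC MB |
|---|---|---|---|---|---|---|---|---|---|
| A14 | 2 | 192 | 128 | 8 | 4 | 192 | 128 | 4 | 16 |
| A15 | 2 | 192 | 128 | 12 | 4 | 192 | 128 | 4 | 32 |
| M1 | 4 | 192 | 128 | 12 | 4 | 128 | 64 | 4 | 16 |
| M1 Pro | 6, 8 | 192 | 128 | 24 | 2 | 128 | 64 | 4 | 32 |
| M1 Max | 8 | 192 | 128 | 24 | 2 | 128 | 64 | 4 | 64 |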
 
I like the CPU-architecture code names:
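| Chip | Codename |
|---|---|
| A7 | Cyclone |
| A8,8X | Typhoon |
| A9,9X | Twister |

| Chip | Fast | Slow |
|---|---|---|
| A10,10X | Hurricane | Zephyr |
| A11 | Monsoon | Mistral |
| A12,12X,12Z | Vortex | Tempest |
| A13 | Lightning | Thunder |
| A14, M1 | Firestorm | Icestorm |
| A15 | Avalanche | Blizzard |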
 
I'll now turn to the GPU's. In the A series, Apple used PowerVR GPU's, while in M1, Apple uses its own design. EU = execution unit, ALU = arithmetic-logical unit.

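| Chip | Make | # Cores | # EU/core | # ALU/EU | # AI cores |
|---|---|---|---|---|---|
| (A1,2) | PVR | 1 | 1 | 8 | |
| (A3),A4 | PVR | 1 | 2 | 8 | |
| A5,5X,6 | PVR | 2,4,3 | 2 | 8 | |
| A6X | PVR | 4 | 4 | 8 | |
| A7,8,8X,9 | PVR | 4,8,6,12 | 4 | 8 | |
| A10,10X | PVR | 6,12 | 4 | 8 | |
| A11 | APL | 3 | 8 | 8 | 2 |
| A12,12X,12Z | APL | 4,7,8 | 8 | 8 | 8 |
| A13 | APL | 4 | 8 | 8 | 8 |
| A14 | APL | 4 | 16 | 8 | 16 |
| A15 | APL | 4,5 | 32 | 8 | 16 |
| M1 | APL | 7,8 | 16 | 8 | 16 |
| M1 Pro | APL | 14,16 | 16 | 8 | 16 |
| M1 Max | APL | 24,32 | 16 | 8 | 16 |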
 
An M1 Ultra thus has 16 fast CPU cores, 4 slow CPU cores, as many as 64 GPU cores and 32 AI cores. The GPU cores in turn have a total of 1024 execution units and 8192 arithmetic-logic units.
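The arithmetic behind those totals, as a small Python sketch. The per-chip figures are the ones quoted in this thread for the top M1 Max configuration (8 fast cores, 2 slow cores, 32 GPU cores, 16 AI cores, 16 EU per GPU core, 8 ALU per EU). The `Chip` class and the `fuse` helper are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass

# Figures as quoted in this thread, not independently verified.
EU_PER_GPU_CORE = 16
ALU_PER_EU = 8


@dataclass(frozen=True)
class Chip:
    fast_cores: int
    slow_cores: int
    gpu_cores: int
    ai_cores: int


def fuse(chip: Chip, copies: int = 2) -> Chip:
    """Model an 'Ultra' as `copies` identical dies acting as one chip."""
    return Chip(
        fast_cores=chip.fast_cores * copies,
        slow_cores=chip.slow_cores * copies,
        gpu_cores=chip.gpu_cores * copies,
        ai_cores=chip.ai_cores * copies,
    )


m1_max = Chip(fast_cores=8, slow_cores=2, gpu_cores=32, ai_cores=16)
m1_ultra = fuse(m1_max)

execution_units = m1_ultra.gpu_cores * EU_PER_GPU_CORE
alus = execution_units * ALU_PER_EU

print(m1_ultra)
print(execution_units)  # 64 * 16 = 1024
print(alus)             # 1024 * 8 = 8192
```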

Impressive parallelism.

Apple calls the fast cores "performance cores" and the slow cores "efficiency cores", since the slow ones are designed for low power consumption.
 
A notable feature of the  Mac Studio is that it's not a new iMac but a monitor-less box like the Mac Mini and the Mac Pro.

Seems like Apple is moving away from Steve Jobs's all-in-one design ideology. Though not expandable, the Mac Studio has a lot of ports on it: 2 USB-A ports, 2 USB-C 3.2 Gen 2, 4 Thunderbolt 4 USB-C 4.0 ports with HDMI 2.0, two 10Gb Ethernet ports, and an SD-card slot.  Comparison of current Macintosh models

The M1 chip is a  System on a chip but it is packaged with RAM to make a  System in a package

Perusing what Apple offers, I found out how much memory the chip packages can enclose. I've also included the size of each chip and how many transistors.
| Chip | RAM: GB | Die size: mm^2 | Transistors: billion |
|---|---|---|---|
| M1 | 8, 16 | 120 | 16 |
| M1 Pro | 16, 32 | 245 | 33.7 |
| M1 Max | 32, 64 | 432 | 57 |
| M1 Ultra | 64, 128 | 864 | 114 |
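One thing these numbers let you check is transistor density. A quick Python sketch using only the die sizes and transistor counts listed above; the function name `density_millions_per_mm2` is just a label for this example.

```python
# (die size in mm^2, transistors in billions), as listed in this thread
chips = {
    "M1": (120, 16.0),
    "M1 Pro": (245, 33.7),
    "M1 Max": (432, 57.0),
    "M1 Ultra": (864, 114.0),
}


def density_millions_per_mm2(die_mm2: float, transistors_billion: float) -> float:
    """Transistor density in millions of transistors per square millimetre."""
    return transistors_billion * 1000 / die_mm2


for name, (die_mm2, transistors_billion) in chips.items():
    density = density_millions_per_mm2(die_mm2, transistors_billion)
    print(f"{name:9s} {density:6.1f} M transistors / mm^2")
```

All four come out in the low 130s of millions per mm^2, which is what you would expect if they share a manufacturing process. The Ultra matches the Max exactly because both its die size and its transistor count are listed as double the Max's.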
 
Since I last posted,  Apple silicon has gone to second and third generations: M2, M2 Pro, M2 Max, M2 Ultra, and M3, M3 Pro, M3 Max. So I'll update the tables.

This series is used in iPhones and some iPads:

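| Chip | # Fast | Fast L1i KB | Fast L1d KB | Fast L2 MB | # Slow | Slow L1i KB | Slow L1d KB | Slow L2 MB | SLC MB |
|---|---|---|---|---|---|---|---|---|---|
| A10 | 2 | 64 | 64 | 3 | 2 | 32 | 32 | 1 | 4 |
| A10X | 3 | 64 | 64 | 8 | 3 | 32 | 32 | 1 | 4 |
| A11 | 2 | 64 | 64 | 8 | 2 | 32 | 32 | 1 | 4 |
| A12 | 2 | 128 | 128 | 8 | 2 | 32 | 32 | 2 | 8 |
| A12X | 2 | 128 | 128 | 8 | 4 | 32 | 32 | 2 | 8 |
| A12Z | 2 | 128 | 128 | 8 | 4 | 32 | 32 | 2 | 8 |
| A13 | 2 | 192 | 128 | 8 | 4 | 128 | 64 | 4 | 16 |
| A14 | 2 | 192 | 128 | 8 | 4 | 128 | 64 | 4 | 16 |
| A15 | 2 | 192 | 128 | 12 | 4 | 128 | 64 | 4 | 32 |
| A16 | 2 | 192 | 128 | 16 | 4 | 128 | 64 | 4 | 24 |
| A17 | 2 | 192 | 128 | 16 | 4 | 128 | 64 | 4 | 24 |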

For some reason, most of these chips have more slow cores than fast cores. Could that be because these chips are for battery-powered devices?

Also, the A16 architectures are codenamed Everest and Sawtooth.
 
Now the Mac and some iPad chips. The M1 and M2 Ultra chips are two M1 or M2 Max chips connected to each other at their edges, and packaged in one chip package. There is no mention of an M3 Ultra, though Apple may eventually introduce one.

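| Chip | # Fast | Fast L1i KB | Fast L1d KB | Fast L2 MB | # Slow | Slow L1i KB | Slow L1d KB | Slow L2 MB | SLC MB |
|---|---|---|---|---|---|---|---|---|---|
| M1 | 4 | 192 | 128 | 12 | 4 | 128 | 64 | 4 | 8 |
| M1 Pro | 6, 8 | 192 | 128 | 24 | 2 | 128 | 64 | 4 | 24 |
| M1 Max | 8 | 192 | 128 | 24 | 2 | 128 | 64 | 4 | 48 |
| M1 Ultra | 16 | 192 | 128 | 48 | 4 | 128 | 64 | 8 | 96 |
| M2 | 2 | 192 | 128 | 16 | 4 | 128 | 64 | 4 | 8 |
| M2 Pro | 6 | 192 | 128 | 32 | 4 | 128 | 64 | 4 | 24 |
| M2 Max | 8 | 192 | 128 | 32 | 4 | 128 | 64 | 4 | 48 |
| M2 Ultra | 16 | 192 | 128 | 64 | 8 | 128 | 64 | 8 | 96 |
| M3 | 4 | 192 | 128 | 16 | 4 | 128 | 64 | 4 | |
| M3 Pro | 5, 6 | 192 | 128 | 32 | 6 | 128 | 64 | 4 | |
| M3 Max | 10, 12 | 192 | 128 | 32 | 4 | 128 | 64 | 4 | |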

Kinds of memory:
  • CPU registers
  • L1i - level-1 instruction cache
  • L1d - level-1 data cache
  • L2 - level-2 cache
  • SLC - system-level or level-3 cache
  • RAM
  • Persistent storage: disk drives or flash memory
As one goes down the list, the memory gets more and more capacious, but also slower and slower. There is a tradeoff between (1) speed and (2) expense and power consumption.
 
Now the GPU's, the graphics accelerators. In the A series, Apple used PowerVR GPU's, and later its own design. In M1, Apple uses its own design. EU = execution unit, ALU = arithmetic-logical unit.

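| Chip | Make | # Cores | # EU/core | # ALU/EU | # AI cores |
|---|---|---|---|---|---|
| (A1,2) | PVR | 1 | 1 | 8 | |
| (A3),A4 | PVR | 1 | 2 | 8 | |
| A5,5X,6 | PVR | 2,4,3 | 2 | 8 | |
| A6X | PVR | 4 | 4 | 8 | |
| A7,8,8X,9 | PVR | 4,8,6,12 | 4 | 8 | |
| A10,10X | PVR | 6,12 | 4 | 8 | |
| A11 | APL | 3 | 4 | 16 | 2 |
| A12,12X,12Z | APL | 4,7,8 | 4 | 16 | 8 |
| A13 | APL | 4 | 4 | 16 | 8 |
| A14 | APL | 4 | 4 | 16 | 16 |
| A15 | APL | 4,5 | 4 | 32 | 16 |
| A16 | APL | 5 | 4 | 32 | 16 |
| A17 | APL | 6 | 4 | 32 | 16 |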
 
Now the M series.
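| Chip | Make | # Cores | # EU/core | # ALU/EU | # AI cores |
|---|---|---|---|---|---|
| M1 | APL | 7,8 | 4 | 32 | 16 |
| M1 Pro | APL | 14,16 | 4 | 32 | 16 |
| M1 Max | APL | 24,32 | 4 | 32 | 16 |
| M1 Ultra | APL | 48,64 | 4 | 32 | 32 |
| M2 | APL | 8,10 | 4 | 32 | 16 |
| M2 Pro | APL | 16,19 | 4 | 32 | 16 |
| M2 Max | APL | 30,38 | 4 | 32 | 16 |
| M2 Ultra | APL | 60,76 | 4 | 32 | 32 |
| M3 | APL | 8,10 | 16 | 8 | 16 |
| M3 Pro | APL | 14,18 | 16 | 8 | 16 |
| M3 Max | APL | 30,40 | 16 | 8 | 16 |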

The M3 has a design shift, toward using more EU's per core, but fewer ALU's per EU. The number of AI cores has stayed constant, however.
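Taking the table's figures at face value, a quick Python check shows what the shift does to the ALU count per GPU core. The dictionary below just copies the EU/core and ALU/EU columns for each generation; nothing else is assumed.

```python
# (EUs per GPU core, ALUs per EU), copied from the M-series GPU table above
gpu_layout = {
    "M1": (4, 32),
    "M2": (4, 32),
    "M3": (16, 8),
}

# ALUs per GPU core is simply the product of the two columns.
alus_per_core = {
    chip: eus * alus_per_eu for chip, (eus, alus_per_eu) in gpu_layout.items()
}

for chip, total in alus_per_core.items():
    print(f"{chip}: {total} ALUs per GPU core")
```

By these numbers each GPU core ends up with 128 ALUs in all three generations, so the M3 change looks like a regrouping of the same ALU count into more, narrower execution units rather than a change in raw width.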
 
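| Chip | RAM: GB | Die size: mm^2 | Transistors: billion |
|---|---|---|---|
| M1 | 8, 16 | 120 | 16 |
| M1 Pro | 16, 32 | 245 | 33.7 |
| M1 Max | 32, 64 | 432 | 57 |
| M1 Ultra | 64, 128 | 864 | 114 |
| M2 | 8, 16, 24 | 155 | 20 |
| M2 Pro | 16, 32 | | 40 |
| M2 Max | 32, 64, 96 | | 67 |
| M2 Ultra | 64, 128, 192 | | 134 |
| M3 | 8, 16, 24 | | 25 |
| M3 Pro | 18, 36 | | 37 |
| M3 Max | 36, 96 / 48, 64, 128 | | 92 |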

Apple's M series seems to be a success so far, doing well against Intel and AMD chips, and it is the fourth CPU architecture of the Macintosh series:

Motorola 68K - IBM/Motorola PowerPC - Intel x86 - ARM

Also of note is the slow increase in clock speed for the newer chips, much slower than the increase in number of transistors or memory size. Given what CPU chip makers try to do, it seems like they are running into some physical limit.
 