The hard truth about PowerPC (and any RISC) emulation

[ARCHIVED] About PearPC, a mostly obsolete PPC Mac emulator for Windows and Linux to run MacOS X 10.1 up to 10.4. Using QEMU is now recommended.

Moderators: Cat_7, Ronald P. Regensburg

Locked
Jones

The hard truth about PowerPC (and any RISC) emulation

Post by Jones »

I was thinking about what kind of performance we can get from PearPC. After a first look at the source code:

PearPC emulates a PPC CPU. The big problem here is the registers (internal components of a CPU that hold data for operations; they generally have very low latency, so you can say that they work at the CPU's clock speed in MHz).

Well, in the PowerPC architecture (and RISC CPUs in general) there are a lot of registers, normally about 100+, but in the (standard) x86 architecture the CPU operates with just a few of them and relies on a special stack to keep data flowing smoothly.

This is the reason why a 1 GHz RISC CPU can beat a 3 GHz x86 CPU in some applications.

Well, in PearPC emulation, all PPC registers are kept in memory data structures, which is extremely slow (memory in a high-end PC now works at 400 MHz, but latency is about 2.5 or 3 cycles), so a very "fast" (in MHz) x86 CPU doesn't help too much.

There is hope! Yes, there is the cache memory: the more your CPU has, the better PearPC will work, because cache memory can run at about 0.3 times register speed.

But cache memory is not the salvation :? .
Modern x86 CPUs have very, very long pipelines in order to achieve high MHz rates. THIS IS VERY BAD. The longer the pipeline, the more CPU cycles are lost every time the program jumps to another place, and given the architecture of the emulated PPC CPU, this is a very frequent case, so even a 10+ GHz machine is not the answer.

There are three ways to get better performance:
1. Complex algorithm and compilation optimization. This will require a lot of research and is very complex to achieve.
2. Get a big-cache CPU and recompile PearPC with the CPU's cache optimization flags (you will need special compilers to do this, Intel and AMD specific). Very good CPUs here are Xeon and Opteron. A normal user is in good shape with a Pentium M notebook or an Athlon FX, maybe a P4 EE (a pseudo-Xeon CPU).
3. Get a short-pipeline CPU. I only see the Pentium M here (anyone?)

Any kind of Celeron, Duron or old Athlon is not recommended. Also, I think that a P4 CPU (my case) is not good either, but it will work because of the high MHz speed. I think that a 1.4 or 1.6 GHz P4 is slower than a 1 GHz P3 or Athlon CPU for PearPC.

Well, hopefully the authors of this project have enough resources and time to improve and optimize the emulator. It's certainly a big task. Maybe a company could help (VMware?), but it is risky: Apple will try to shut this thing down sooner or later. Somewhere I read about PCI cards that contain a PowerPC and memory. This is an excellent way to get a Mac inside a PC (and cheap enough, maybe US$ 350) (Apple did this the opposite way), but the work of the people behind it will probably never see the light of day, because Apple will not allow it (this could simply kill Apple; think about having a US$ 1000 Mac-PC).

Well, only time will tell....
Guest

Post by Guest »

PearPC has a register allocator and generates quite good code. The registers are not kept in memory but dynamically mapped to x86 registers. Take a look at the code and at http://pearpc.sourceforge.net/wiki/inde ... p_CPU_chat
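
To make the idea of dynamically mapping guest registers onto host registers concrete, here is a minimal sketch. It is not PearPC's actual allocator: the class name, the LRU eviction policy, and the choice of six usable host registers are all assumptions made for illustration. It only counts how often a guest register would have to be evicted (spilled) for a given access pattern.

```python
from collections import OrderedDict

# Hypothetical set of host registers left over for guest values (an assumption).
HOST_REGS = ("eax", "ecx", "edx", "ebx", "esi", "edi")


class RegisterAllocator:
    """Toy allocator: maps guest registers to host registers with LRU eviction."""

    def __init__(self, host_regs=HOST_REGS):
        self.free = list(host_regs)
        self.mapping = OrderedDict()  # guest register -> host register, oldest first
        self.spills = 0

    def get(self, guest_reg):
        """Return the host register holding guest_reg, evicting one if needed."""
        if guest_reg in self.mapping:
            self.mapping.move_to_end(guest_reg)  # mark as most recently used
            return self.mapping[guest_reg]
        if self.free:
            host = self.free.pop(0)
        else:
            # No free host register: evict the least recently used guest register.
            _, host = self.mapping.popitem(last=False)
            self.spills += 1
        self.mapping[guest_reg] = host
        return host


def count_spills(trace, host_regs=HOST_REGS):
    """Count evictions for a sequence of guest register numbers."""
    alloc = RegisterAllocator(host_regs)
    for guest_reg in trace:
        alloc.get(guest_reg)
    return alloc.spills
```

In this toy model a loop that keeps touching the same three guest registers never spills, while a trace that cycles through all 32 guest registers spills on almost every access. How close real code is to either extreme is exactly what the thread is arguing about.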
Le petit prince
Tinkerer
Posts: 47
Joined: Wed May 12, 2004 9:03 pm

Re: The hard truth about PowerPC (and any RISC) emulation

Post by Le petit prince »

Jones wrote: PearPC emulates a PPC CPU. The big problem here is the registers (internal components of a CPU that hold data for operations; they generally have very low latency, so you can say that they work at the CPU's clock speed in MHz).

Well, in the PowerPC architecture (and RISC CPUs in general) there are a lot of registers, normally about 100+, but in the (standard) x86 architecture the CPU operates with just a few of them and relies on a special stack to keep data flowing smoothly. [...]
:?: On what information do you base this? To me that seems to be clearly wrong. - This is a short excerpt from Sebastian Biallas' CPU chat:

Seppel* ok: I currently have the following problem
Seppel* 8 regs vs. 32 regs on PowerPC is not a real problem
Seppel* (in contrast to the rumors you always hear)

Seppel* the real problems currently are:
Seppel* the MMU
Seppel* that is the part that translates virtual addresses into real addresses
Seppel* let me elaborate on this
Seppel* there's a unit in the powerpc that translates all memory accesses into addresses of the real memory
Seppel* (x86 of course has this too, and on x86 it's even far more complicated)
Seppel* but the problem is,
Seppel* on every memory access, I have to call a function which translates the address
Seppel* the address translation is important, because addresses might not be in memory
Seppel* or an address might be protected
Seppel* (ever heard of "general protection fault"? now you know, the application wanted to access memory it doesn't own)
Seppel* so I have to call this function on every memory access (at least currently)
Seppel* and THAT is one of the real things that slow down PearPC
Seppel* not some "too few registers" bullshit (sorry)
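
The cost described in the excerpt can be sketched roughly as follows. This is a deliberately simplified flat page table, not the real PowerPC MMU (which has more translation stages) and not PearPC's code; every name here (SoftMMU, GuestMemory, the fault classes, the 4 KiB page size) is an assumption for illustration. The point is only that each guest load or store goes through a translate() call that may also raise a fault.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class PageFault(Exception):
    """The virtual address is not mapped to real memory."""


class ProtectionFault(Exception):
    """The address is mapped, but the access is not permitted."""


class SoftMMU:
    """Toy software MMU with a flat page table (hypothetical layout)."""

    def __init__(self):
        self.page_table = {}   # virtual page number -> (physical page number, writable)
        self.translations = 0  # how many times translate() has been called

    def map_page(self, vpn, ppn, writable=True):
        self.page_table[vpn] = (ppn, writable)

    def translate(self, vaddr, write=False):
        self.translations += 1
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        entry = self.page_table.get(vpn)
        if entry is None:
            raise PageFault(hex(vaddr))
        ppn, writable = entry
        if write and not writable:
            raise ProtectionFault(hex(vaddr))
        return ppn * PAGE_SIZE + offset


class GuestMemory:
    """Guest RAM where every access is translated first."""

    def __init__(self, mmu, size):
        self.mmu = mmu
        self.ram = bytearray(size)

    def load8(self, vaddr):
        return self.ram[self.mmu.translate(vaddr)]

    def store8(self, vaddr, value):
        self.ram[self.mmu.translate(vaddr, write=True)] = value & 0xFF
```

In this sketch one store and one load cost two translate() calls, so the translation overhead scales directly with the number of guest memory accesses.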



Regards,

Le petit prince
Jones

Registers "are" a problem.

Post by Jones »

You must load registers before you can operate on them. Compiled RISC code is optimized to work this way, so if a segment of RISC code operates on, let's say, 32 registers continually, the x86 translation must save and then reload each operand, so MEMORY IS ALWAYS A BOTTLENECK.

Maybe I overlooked the memory allocation problem, which is addressed through a register that you have to keep in memory too. Memory will always be a problem, because a register inside the CPU is about 100 times faster (usually more). Cache memory is very important here. The drawback is that PowerPC CPUs have a good amount of cache memory too, so you simply cannot fit all registers, frequently used data and PPC cache data in a "small"-cache x86 CPU. Any time you get more cache, PearPC will work better.

As proof, I saw another post about OS X installation times on different PCs. No surprise that a Pentium M-based notebook (1 MB cache, short pipeline) took the lead (only 36 minutes), beating a 3.2 GHz P4 (2+ hours). So I "still" think I have a point here, and "experimental" tests prove it.
PGHammer

PearPC speed on x86

Post by PGHammer »

It's not just cache memory that affects PearPC speed; system memory (both speed and type) also impacts PearPC performance (it may be the second largest factor after cache memory).

It was mentioned earlier that Pentium-M processors (notebook processors with large on-die caches) typically run PearPC best. Has anyone run PearPC on current P4E (Prescott) processor-based systems? Like the original P4EE, most P4Es have the 1 MB cache (and the HyperThreading support the M lacks). Further, there have been few reports from Athlon64 or Opteron owners (let alone Xeon owners running Linux).

The jury is still out (more data is needed), so let's *not* jump to conclusions.
grayfox
Tinkerer
Posts: 49
Joined: Thu May 13, 2004 2:42 am

Post by grayfox »

I won't touch the Prescott; there are huge design flaws in it.
PGHammer

Re: The hard truth about PowerPC (and any RISC) emulation

Post by PGHammer »

Jones wrote: [...] Any kind of Celeron, Duron or old Athlon is not recommended. Also, I think that a P4 CPU (my case) is not good either, but it will work because of the high MHz speed. I think that a 1.4 or 1.6 GHz P4 is slower than a 1 GHz P3 or Athlon CPU for PearPC.

[...] Apple will try to shut this thing down sooner or later. [...]
Actually, it depends on the P4 (and the configuration of the system itself).

There are *six* different types of Pentium 4 processors for desktops:

1. The original S423 P4 (Willamette): 256K cache, but a pipeline shorter than that of the Pentium-M.

2. The first volume S478 P4 (Northwood-A) doubles the cache from Willamette and adds support for the 133 MHz quad-pumped frontside bus.

3. The extremely prevalent Northwood-B series dropped support for the 100 MHz quad-pumped FSB from Northwood-A. A single example (3.06B) adds HyperThreading support.

4. Today's volume processor, the Northwood-C, uses a 200 MHz quad-pumped FSB. All models include HyperThreading support.

5. The unique Northwood-EE (the original Pentium 4 Extreme Edition) takes the Northwood-C's core and doubles the on-die cache to 1 MB. It only ships in one speed (3.2EE).

6. Lastly, there's Prescott (P4E), with an even longer pipeline than Northwood-C.
All currently shipping versions use the 200 MHz quad-pumped FSB of Northwood-C and include HyperThreading support; most have 1 MB of on-die cache. There is also the unique 3.4E-EE, with an even larger 2 MB on-die cache mated to the Prescott core.

More data is needed.

Also, why would Apple try to shut it down?

It is of no real threat to Apple's hardware market (performance is still not up to that of a real Mac, and isn't meant to be); if anything, it would result in an *increase* of OS X software sales simply because Apple and other OS X software developers would gain the not-insignificant Linux market (and the hyperdominant Win32 market) as sales targets.

OS X as a Win32 or Linux application is no more threatening to Mac hardware sales than VirtualPC for Mac has been to PC sales.

Kill it? If anything, Apple should *bankroll* it.

Gaining additional revenues (with very little outlay) is nothing to be sneezed at.
CaptainValor
Forum All-Star
Posts: 587
Joined: Mon May 17, 2004 11:57 pm

Post by CaptainValor »

Exactly. And think of the PC users who have been waiting for something like this for a very LONG time (e.g. the AquaXP community). I'm sure many of them who do not own a Mac will pay for OS X simply to run it on PearPC once the performance gets better. And once we figure out how to get larger-capacity drives, you could potentially use the emulated Mac for more than just experimentation.

And speaking of performance, this discussion about PowerPC emulation is fascinating. It would be great to get Sebastian himself to comment here.
Marc
Master Emulator
Posts: 357
Joined: Wed Aug 20, 2003 2:14 pm

Post by Marc »

Come on! How many people are actually going to go out and spend their hard-earned cash on OS X to run it on a PPC emulator on an x86 system, no matter how good the performance is? I can't see a significant increase in Apple's revenue from this. Most people will use a pirated version. Sad but true. The only people who will play it above board are the people who need a true cross-platform solution for business or productivity.

Yes, I like OS X as an OS and yes, I use PearPC. I didn't pay the 100 pounds for OS X; I borrowed a copy from a friend to see what it was like. I wouldn't even consider purchasing a license at that price to use it in an emu. The only way I would consider paying REAL money for OS X would be if I could run it natively on my Pentium 4 box as a real alternative to Windows.
Temporary
Student Driver
Posts: 11
Joined: Mon May 17, 2004 9:11 am

Post by Temporary »

Marc wrote: Come on! How many people are actually going to go out and spend their hard-earned cash on OS X to run it on a PPC emulator on an x86 system, no matter how good the performance is? I can't see a significant increase in Apple's revenue from this. Most people will use a pirated version. Sad but true. The only people who will play it above board are the people who need a true cross-platform solution for business or productivity.
Well, you could say the same thing about Mac users as well: why would they bother to buy the latest OS X? Just download it and install it on their G4 or G3.

The truth is that although piracy is common among PC and Mac users, there are still some decent people out there, who make up the majority of the market, who actually buy legal copies.
Of course you can't expect any significant increase in Apple's revenue from this right now, as the emulator is in its very early stages and indeed as slow as a snail. But once speed and performance improve, and x86 CPUs continue to climb the GHz ladder while widening the gap with PPC CPUs, to the point where one could emulate, say, an 800 MHz G4 on an x86 PC, we will see an increase in OS X sales as people go out and buy it to run on their PCs (maybe someone will make a Linux distribution that emulates a G4, making it affordable to buy OS X, as Linux itself is free).

If GNUstep continues to improve, we might see the killer app that will make Apple go x86.
Yukon Kid
Mac Mechanic
Posts: 169
Joined: Thu May 20, 2004 6:27 pm

Post by Yukon Kid »

I am on my second Mac, and like those in the Win world I bought the newest OSes as they came out. So I have several disks sitting in the drawer that I simply don't use any longer.
Pirate a copy??? No way. But I would gladly sell the disks with the versions I no longer wish to install on my computer.
How many people do you know who have the Win 3.1 or 95 or 98 disks and are now running 2000 or XP?
I suggest heading to eBay or the like and seeing if you can get Jaguar cheap.
How many programs have you paid $$$$$ for and don't really use????
So drop $$$ for an OS that works great, and if you don't want it, well, resell it on eBay.
ClockWise
Site Admin
Posts: 4397
Joined: Mon May 20, 2002 4:37 am
Location: Uiwang

Post by ClockWise »

If you want to unload some old copies of the OS, I can sticky a "MacOS for sale" thread.
Guest

Post by Guest »

What makes you think the speed will *ever* increase to any degree suitable to run OS X usably? The programmer himself said it is his *hope* (not even an expectation or high probability) that he can get 1/10 the speed of a real PPC. That sounds pretty bad... Doesn't sound usable to me. More a novelty than anything else, really.

A hardware PCI card solution is different, though. As far as piracy goes, I think if Apple thought there was a market for this they would've done it long ago with the 68k Macs. I remember everyone harping on the same concept for those 10 years ago. Never happened. Wonder why?
Marc
Master Emulator
Posts: 357
Joined: Wed Aug 20, 2003 2:14 pm

Post by Marc »

Temporary wrote: Well, you could say the same thing about Mac users as well: why will they bother to buy latest OSX? just download it and install it on their G4 or G3.
Apple make most of their money from hardware.
Marc
Master Emulator
Posts: 357
Joined: Wed Aug 20, 2003 2:14 pm

Post by Marc »

Yukon Kid wrote: I am on my second Mac, and like those in the Win world I bought the newest OSes as they came out. So I have several disks sitting in the drawer that I simply don't use any longer.
Pirate a copy??? No way. But I would gladly sell the disks with the versions I no longer wish to install on my computer.
How many people do you know who have the Win 3.1 or 95 or 98 disks and are now running 2000 or XP?
I suggest heading to eBay or the like and seeing if you can get Jaguar cheap.
How many programs have you paid $$$$$ for and don't really use????
So drop $$$ for an OS that works great, and if you don't want it, well, resell it on eBay.
OS X isn't very cheap on eBay. All in all it is an expensive outlay for something that isn't fully useable. If it was something like VirtualMac (in the mould of Virtual PC) it could be considered as a cross-platform solution IF one really needed one. As such, I think it's fun to mess around with OS X on x86, just as it is to mess around with different OSes in Virtual PC. But the fact remains, just because it is used for emulation doesn't mean an OS X license is going to be any cheaper.

All in all, is it worth buying OS X to use in PearPC? No way, no how. Don't get me wrong, PearPC is an astounding achievement, but it isn't enough to make me want to spend a fortune on OS X.
scriptfactory

Post by scriptfactory »

Guest wrote: What makes you think the speed will *ever* increase to any degree suitable to run OS X usably? The programmer himself said it is his *hope* (not even an expectation or high probability) that he can get 1/10 the speed of a real PPC. That sounds pretty bad... Doesn't sound usable to me. More a novelty than anything else, really.
I think you misread the statement. It's 1/10 the speed of the _host_, meaning the computer the emulator is running on. So for every 10 x86 instructions it executes one PPC instruction. So, for a 3 GHz processor he is trying to get (very roughly) 300 MHz G3 performance, which isn't all that bad. PearPC actually runs quite well on my Barton 2800+ right now, and I really have faith that the core/drivers will speed up considerably in the coming months.
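
The back-of-the-envelope arithmetic above can be written out as a tiny sketch. It assumes, very roughly, that clock speed stands in for instructions per second on both host and guest, which real CPUs do not guarantee; the function name and the default ratio of 10 are only illustrative.

```python
def effective_guest_mhz(host_mhz, host_instr_per_guest_instr=10):
    """Rough guest clock equivalent if each guest instruction costs N host instructions."""
    return host_mhz / host_instr_per_guest_instr


# A 3 GHz host at a 10:1 ratio lands at roughly a 300 MHz guest.
estimate = effective_guest_mhz(3000)
```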
Mr. T

Post by Mr. T »

Getting back to the original topic, I think the biggest problem will be emulating the opcodes in as few instructions as possible. This is true of any emulator. I don't know specific details about PearPC, but I'm assuming that most, if not all, of the code is still in some high-level language (probably C++). An emulator is definitely one of those occasions where performance can be significantly improved by the presence of assembly, especially for the most commonly used operations.

The other thing is that any emulator will tend to execute an extremely high number of unconditional branch instructions, which will absolutely wreak havoc on the pipeline. This will place an unusually high amount of stress on the entire memory architecture, but in particular the bus. It also means that a processor with a significantly shorter pipeline WILL run PearPC significantly more efficiently.
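
To illustrate where those extra branches come from, here is a minimal sketch of an interpreter-style fetch/decode/dispatch loop. The three-opcode instruction set is invented for this example and has nothing to do with real PowerPC encodings or with how PearPC is actually structured. Every guest instruction costs one table lookup and one indirect call, which on real hardware shows up as a jump the CPU has to predict.

```python
def op_li(regs, pc, rd, imm):
    """Load immediate: rd = imm."""
    regs[rd] = imm
    return pc + 1


def op_addi(regs, pc, rd, ra, imm):
    """Add immediate: rd = ra + imm."""
    regs[rd] = regs[ra] + imm
    return pc + 1


def op_bnez(regs, pc, ra, target):
    """Branch to target if ra is not zero."""
    return target if regs[ra] != 0 else pc + 1


# Hypothetical opcode table for the toy instruction set.
HANDLERS = {"li": op_li, "addi": op_addi, "bnez": op_bnez}


def run(program, regs):
    """Interpret the program; return how many dispatches (indirect jumps) were made."""
    pc = 0
    dispatches = 0
    while pc < len(program):
        opcode, *operands = program[pc]
        handler = HANDLERS[opcode]         # decode
        pc = handler(regs, pc, *operands)  # dispatch: one indirect call per instruction
        dispatches += 1
    return dispatches


# Toy guest program: add 2 to r3 five times, counting down in r4.
PROGRAM = [
    ("li", 3, 0),
    ("li", 4, 5),
    ("addi", 3, 3, 2),
    ("addi", 4, 4, -1),
    ("bnez", 4, 2),
]
```

Running PROGRAM executes 17 guest instructions and therefore 17 dispatches on top of the guest's own five conditional branches. A translator that emits host code for whole blocks avoids most of the per-instruction dispatches, which is one reason the interpreter-versus-recompiler distinction matters for this argument.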

Finally, I buy the "register" dilemma for reasons already mentioned.

Oh, and one more thing. Someone mentioned that the author expected to get 1/10th the performance of the host machine. I think this is a pretty optimistic goal considering that most non-commercial emulators (console and others) don't achieve this. On the other hand, commercial emulators like Virtual PC for Mac get significantly better than this, simply because indie projects don't have the same resources as commercial ones. Also, I don't think it's possible to get the 1/10th without a significant amount of assembly, which will make the emulator very much CPU and OS dependent. Correct me if I'm wrong, but doesn't this violate one of the original goals of PearPC?
CaptainValor
Forum All-Star
Posts: 587
Joined: Mon May 17, 2004 11:57 pm

Post by CaptainValor »

I agree that assembly would make PearPC run much faster. But unless I'm mistaken, assembly doesn't have to be extremely hardware-specific. Already the nightly builds are being compiled for specific processor families. I think a similar thing could be done with assembly versions of PearPC.
robojam
Forum All-Star
Posts: 779
Joined: Thu Apr 17, 2003 10:52 pm
Location: Charlotte, NC. USA

Post by robojam »

Assembly language IS very hardware specific in that it is designed for a particular processor family, so for an x86 processor, x86 assembler must be used.

The problem lies in convincing the operating system that it is interfacing with the correct hardware, but this has two facets to it. Firstly there is the processor problem, which at its most basic is register mapping, but there are other issues involved too. Secondly there is the wide range of hardware available to x86 OS users compared to the much more standardized hardware available to Mac OS users.

Assembler would be of most use in emulating the processor, but it would involve too many lines of code for it to be practical in dealing with the hardware issues. It is very easy to get hold of information on the two processor families, and so writing the assembly code is more laborious than difficult; but I would guess that most of the problems lie in patching the Mac OS to non-processor hardware in x86 systems.

At the end of the day, none of us can really do better than try to guess what the developers are doing, and rather than make a big issue of this, we should applaud them for making the progress that they have made. If we could achieve 1/10 of the speed of a real PPC, then that would be pretty amazing if you ran it on a very fast x86 system.

Let's give them time - I'm sure there's better to come.
Once you've made something idiot proof, they go and invent a better idiot!
nikoniko
Student Driver
Posts: 12
Joined: Sun May 23, 2004 7:48 pm

Post by nikoniko »

Marc wrote: OS X isn't very cheap on eBay. [snip] Don't get me wrong PearPC is an astounding achievement, but it isn't enough to make me want to spend a fortune on OS X.
How do you define a fortune? Jaguar regularly goes for $30 or less (e.g. http://cgi.ebay.com/ws/eBayISAPI.dll?Vi ... 29674&rd=1). Dinner and a movie can cost that much or more. Some people here would rather forgo the dinner and the movie and sit at home tinkering with an interesting project.

I've even seen it go as low as $5.50 (http://cgi.ebay.com/ws/eBayISAPI.dll?Vi ... 33303&rd=1 ). Sellers like Macliquidators and others have so flooded eBay with OS X auctions that it's easy to find a GREAT price on OS X. Perfect for those who wish to tinker with PearPC.

Cheers,
nikoniko