... gave me the ability to install programs to a native ext2/3 partition without the performance hit of a loop device being implemented....
... A bit of performance would also be gained if Extended partitions were not our forced solution for the two most heavily used partitions on the actual drive (the ext2 partition that programs will run off of, and the Swap partition). The same can be said about reliability that is said about performance. It's not optimal as it is – not in the least. ...
I question the theoretical "performance hit" of these two scenarios, as well as whether their reliability is suspect.
Great questions. (I'd expect nothing less, mind you!) Some have fairly easy answers backed by real-world arguments, and some are founded more in my pet peeves than in huge performance hits. Honesty does us all good. =)
Please bear in mind up front that this partitioning scheme is helpful to me because I will be using extremely demanding X software via X/Qt that demands more system resources than the C3100 can normally give. This heavy usage greatly amplifies the performance hits I would take from these issues compared to a casual user, or to someone who only uses Qtopia-based programs written specifically for the Zaurus.
So here we go-
As for this first question, the answer is fairly simple. The main performance hit comes from drive-geometry issues and the hard drive's own hardware more than from the CPU or software; however, the first portion I'll discuss is on the CPU and software end of things. Data being delivered to drive partitions is routed by priority (much like IRQs establish priority for devices receiving the CPU's attention, and therefore bandwidth). So Primary partitions receive primary routing. Extended partitions must take a back seat to Primary partitions when routing conflicts occur, and they occur a LOT in IDE implementations. Further, Extended partitions are just that: partitions extended FROM a Primary partition. Actually, it would be more accurate to say they are extended THROUGH the Primary partition. For Extended partitions, not only do all routing calls have to be delivered through the Primary partition, but by definition an Extended partition sits on a Logical partition as well. Much as you were suggesting about swap possibly being a loopback device (we'll address that shortly), the Extended-partition scheme is a similar loopback-style system sitting on a Logical drive. "Logical" here means it doesn't really have a physical address on this side of the interface: the Primary partition provides the actual calls for Logical partition access, and the interface then tells the head/servo where to go.
So to recap: any data that comes from or goes to an Extended partition must first wait for any Primary partitions to clear the route. Then, just to get or put the data in the right place, the processor on the HDD controller has to calculate from the actual physical geometry what the "advertised" geometry would need to be for the Extended partition, and then repeat this process for each data packet. It becomes very processor/controller intensive very quickly. It's why Primary partitions are almost always preferred for OSes to boot from. Ditto for swap partitions. It's also why IDE bogs down so badly compared to SCSI and later interfaces, ESPECIALLY when you have a Primary and Secondary hard drive on the same channel. The Primary drive (Master) provides controller services for BOTH the Master and the Slave disk. This is why it's so much faster to copy from a Master drive on one channel to a Master drive on the second channel than from a Master to a Slave: they can't transfer data via the Master and Slave simultaneously, as the Master's controller provides all translation services and can only handle one request at a time. This is the same issue as our Primary/Extended partition issue, just on a whole other level. One last quick note in response to a question concerning this: the Master/Slave issue can be resolved by Cable Select negotiation on a modern or "current" IDE interface, if everything works together. But for our purposes, Microdrives still only adhere to the "yesteryear" performance specifications of ATA-33 and prior implementations, which almost always HAD to have Master/Slave configurations.
We can sum all of the above up as "translation overhead," which is dramatically increased when the most-used partitions are also Extended/Logical partitions. You are introducing two more translation levels, plus the lower-priority issues, when you could avoid all of it by putting that data on Primary partitions in the first place. This "fault," if you will, is rooted in the OS's drivers as well as in the initial hardware interface translation on the controller itself.
That's the hard part. The easy part of the answer is much simpler for most people to understand. In most drive geometries, Primary partitions almost always get the "favored locations" for data. The two favored locations are the first track and the middle tracks, because the physical head of the drive is over those two areas far more often than anywhere else on the drive. This is why just about every operating system in existence that uses a physical hard drive as its operating medium will, by nature, put its most-accessed files on the first or middle tracks IF the user partitions the entire physical drive as one large partition. Things get muddled really fast when multiple partitions are used, as the OS has no real way of knowing where the new physical first and middle tracks are located.
But one thing is ALWAYS true: Extended/Logical partitions are NEVER located on the first track, and in REAL-WORLD APPLICATION they are usually located PAST the middle track as well, simply by the fact that they are almost always placed AFTER the Primary partitions physically on the drive.
Quick second-half recap: data that is placed AFTER the middle physical track takes longer to get to, simply because the servo arm/head has to travel farther out of its normal range to reach the track. Period.
Add the two together and you end up with a worst-case scenario for an HDD with platter geometry. Not only does it take the CPU, the software drivers, and the CPU on the HDD controller longer to translate HOW to get to the data, but once they do, it takes the physical servo arm LONGER to travel to the spot it needs to read from. You'll immediately notice that one of these problems is contained within the OS and its drivers, and the other lies completely within the IDE HDD controller itself.
Does it contribute to real-world performance hits on physical hard drives? You betcha. These are known, basic issues that have been around as long as hard drive technology itself. Most end users, and even programmers, don't know the details of WHY certain partitioning schemes give better performance, but it's been ground into the community for ages to simply do things like put your OS and swap files in the earliest Primary partition available. (This is also where the old, but still true, mantra of "put your swap partition on the earliest Primary partition of your least-used physical drive" for best server performance comes from.) But feel free to run your own performance tests if you doubt the rationale here; it never hurts to verify rather than take someone else's word for it!
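If you want to try that yourself, here's a minimal sketch of the kind of test I mean: it times raw reads at the start, middle, and end of a block device. The device path is just a placeholder (use whatever node your Microdrive shows up as), you'll need root, and you should flush or bypass the OS page cache between runs for honest numbers.

[code]
import os
import time

DEVICE = "/dev/hda"      # placeholder: your Microdrive's device node
CHUNK = 512 * 1024       # 512 KB per read
SAMPLES = 8              # reads per region

def region_speed(fd, offset):
    # Time SAMPLES sequential CHUNK-sized raw reads starting at offset.
    os.lseek(fd, offset, 0)             # 0 = SEEK_SET
    start = time.time()
    for _ in range(SAMPLES):
        os.read(fd, CHUNK)
    return (SAMPLES * CHUNK) / (time.time() - start) / (1024.0 * 1024.0)

# NOTE: remount or otherwise clear the page cache between runs, or the
# cached regions will look faster than they really are.
fd = os.open(DEVICE, os.O_RDONLY)
size = os.lseek(fd, 0, 2)               # 2 = SEEK_END: device size in bytes
for name, off in [("start", 0),
                  ("middle", size // 2),
                  ("end", size - SAMPLES * CHUNK)]:
    print("%-6s %7.2f MB/s" % (name, region_speed(fd, off)))
os.close(fd)
[/code]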
As for extended partitions, I would think that any extra work to access these would be done at mount time. Once they are mounted and the drivers know the addresses of the partitions I would expect there to be zero performance impact. Why do you think differently?
Oops, I already covered most of this above. Also, the software drivers only know the "advertised" (or in our case "LBA") addresses of anything on the hard drive. The CPU on the HDD controller must then translate from LBA into the actual physical drive geometry; hence the bottleneck and performance hit explained in long form above. The software drivers of ANY OS that uses a modern IDE HDD are completely blind to the actual drive geometry. Even the CMOS setup of your desktop computer is blind to it, and only knows/uses the LBA geometry reported by the HDD itself. The CPU on the HDD controller then translates the value the OS calls for into the real physical location on the HDD. The reason it's done this way is to overcome "would-be" geometry limitations like we used to have back with early IDE, RLL, and MFM drives. It's also the general basis of the problems (and solutions) of operating systems being able to recognize drives beyond a certain size/geometry. The actual drive geometry is COMPLETELY known only to the physical electronics of the HDD controller mounted on the drive and is never exposed to the operating system. In this way the OS can use HDDs with capacities MUCH greater than the system builders or operating system engineers ever imagined possible when they released their products. The HDD controller (mounted on the drive itself for IDE drives) does all the work for this, and in doing so also becomes our performance bottleneck here.
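For anyone curious what that sort of translation actually looks like, here's the classic CHS-to-LBA arithmetic in Python. The 16-head/63-sector geometry below is an example of an "advertised" fiction, not any real drive's physical layout, which is exactly the point:

[code]
HEADS = 16      # advertised heads per cylinder (a common LBA-era fiction)
SECTORS = 63    # advertised sectors per track; sector numbers start at 1

def chs_to_lba(c, h, s):
    # Logical block address for an advertised cylinder/head/sector triple.
    return (c * HEADS + h) * SECTORS + (s - 1)

def lba_to_chs(lba):
    # Advertised cylinder/head/sector triple for a logical block address.
    c = lba // (HEADS * SECTORS)
    h = (lba // SECTORS) % HEADS
    s = (lba % SECTORS) + 1
    return (c, h, s)

print(chs_to_lba(0, 0, 1))   # first sector on the disk -> LBA 0
print(lba_to_chs(1008))      # -> (1, 0, 1), the start of cylinder 1
[/code]

The drive's firmware does this kind of mapping (against the real, hidden geometry) for every single request, which is where the translation overhead lives.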
As for loopback, the magic that takes place there is done in software and in RAM: I would expect that there are no additional device interactions that take place. I'd bet that any performance impact would be hard to measure, much less sense on a human level. Again, why do you think differently?
You're exactly right: "the magic that takes place there is done in software and in RAM." I couldn't have said it better myself. And because of this, both the software and the RAM required to make the loopback-device translation happen cost extra CPU cycles along with CPU and memory bandwidth. By definition, anything you add that requires additional software/RAM will add processor overhead and incur a performance hit.
However, in certain circumstances you have a valid point. Pure flash memory, specifically SD cards used with Zaurii, is affected by this. Because of Sharp's ridiculous insistence on an MMC-compatibility-mode implementation of the SD card slot, the performance of any particular SD card may be severely limited by that bottleneck. For example, an SD card advertised as 10x speed may be a bit faster than a normal SD card in a Zaurus SD slot, but a 32x card will offer no more performance gain than a 10x card because of this enforced bandwidth limitation. Because of this bottleneck, you can use an SD card formatted FAT and give yourself ext2/3 storage via a loopback device with hardly any performance hit at all. Testing by OESF members has put the entire performance hit at about 1% of the bandwidth used in these SD transfers. So it can be a pretty smart move to use a loopback device with an SD card on a Zaurus.
The CF slot, however, does not have this natural bottleneck limiting performance. Because of this, the percentage hit when using a high-speed CF device with a loopback device floating on its partition CAN be very substantial. The faster the CF device, the worse the performance hit. The bandwidth/CPU overhead for using a loopback device on a FAT-formatted CF device can rise as high as 30-33%! You can easily incur this sort of hit when you use the CF device to run something large entirely off its partition IN ADDITION to a swap partition. (BTW, this is regardless of Primary/Extended placement.) A good example of an application that fits this scenario would be running X/Qt and a swap partition on the same CF drive through a loopback device. OUCH. Almost 100% of your data calls have to be routed through this loopback translation, and the SOFTWARE and RAM that provide the magic have to steal processor cycles from your CPU for each and every packet. And every time they do, there are fewer CPU cycles and less RAM available for running the actual program you're using.
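If you want to measure this on your own card, here's a rough sketch. It assumes you've already created a large test file both directly on the FAT card and inside a loop-mounted ext2 image on that same card; all paths are placeholders for your own mount points.

[code]
import os
import time

# Placeholder paths: a test file straight on the FAT card, and a file of
# the same size inside a loop-mounted ext2 image living on that card.
PATHS = [("direct on FAT", "/mnt/cf/testfile"),
         ("via loop/ext2", "/mnt/cf-ext2/testfile")]
CHUNK = 64 * 1024

def throughput(path):
    # Sequentially read the whole file; return MB/s.
    fd = os.open(path, os.O_RDONLY)
    total = 0
    start = time.time()
    while True:
        data = os.read(fd, CHUNK)
        if not data:
            break
        total += len(data)
    os.close(fd)
    return total / (time.time() - start) / (1024.0 * 1024.0)

# Unmount/remount between runs so the page cache doesn't flatter
# whichever file you happen to read second.
for label, path in PATHS:
    print("%-14s %7.2f MB/s" % (label, throughput(path)))
[/code]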
You can find most of the information you would need to look into this further, or to verify any of the above, right here on the OESF forums. Just do a search for SD cards, loopback devices, etc. It's how I found out that the performance hit was so low for SD cards in the first place (much to my surprise at the time).
I think the same could be said about swap files vs. swap partitions. (In fact, I would not be surprised to find that loopback is used to implement a swap file.) I doubt you'd really experience any difference in performance.
I'll try to keep this one brief simply because of how well known a performance issue it is. (Nobody pass out here; I know I'm not brief often.) The difference between using a swap file versus a swap partition is very real and very measurable. The more intensively the partition/file residing within the loopback device is used, the greater the performance hit. You can search for good info on this very topic right here on these forums as well. The fact that, in this case, that very performance hit is also being amplified by the Extended-partition translation overhead discussed above only makes it that much larger.
You do bring up an interesting point about the swap and loopback issues, and I can clarify it a bit. The analogy is EXACTLY correct when applied to a swap file, as it is simply swap formatting superimposed over a file on a regular drive partition, and this is accomplished, of course, using the magic of software and RAM. Sound familiar? Again, any layer of translation is always accomplished by an additional layer of software that uses additional RAM (ironically, this also works your swap that much harder) while stealing cycles from your Zaurus's CPU and available bandwidth all the while. A swap partition, on the other hand, is a partition that must be formatted either by the user after its creation or by the system during its first use. In that respect it's just like any other kind of partition and unlike a loopback device: no translation layer is needed. So you had the right idea; you were just applying it over too general an area.
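A quick way to see which kind of swap your own Zaurus is actually running is to read the kernel's /proc/swaps table, which labels each active swap area as a "partition" or a "file":

[code]
# /proc/swaps lists every active swap area; the second column reads
# "partition" or "file", which tells you whether your swap lives on a
# dedicated partition or inside another filesystem.
f = open("/proc/swaps")
lines = f.read().splitlines()
f.close()

for line in lines[1:]:          # first line is the column header
    fields = line.split()
    if len(fields) >= 3:
        print("%s: %s, %s KB" % (fields[0], fields[1], fields[2]))
[/code]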
And lastly, I see no reason for there to be any kind of reliability hit with either of these. If they work, they work. What would make them any less reliable than other solutions?
The reliability issues here, quite frankly, are MUCH more difficult for me to explain away, because the truth is they are NOWHERE NEAR as great an issue as the performance issues are. You've got me cold on this one, I must admit.
The only shred of evidence I'll proffer in this respect is that if the Primary partition(s) that the Extended partitions are attached to, OR the Logical partition that they themselves float on, becomes corrupted, the Extended partitions are most likely laid to waste as well. This doesn't happen often, and even when it does, with modern IDE technology it's usually somewhat recoverable.
To sum it up: I quickly tossed the "reliability" card onto the table and equated it to the performance issues without thinking it through, and in doing so I misrepresented the facts. Thank you for pointing this out; if we don't hold ourselves accountable when we're incorrect, we lack the integrity to be believed when we are!
The one thing about your set-up that would bother me is the two "vestigial" partitions. They do no harm except wasting some space, but it's somewhat ugly to have to keep them. I would expect though that this can be fixed easily if it truly is only scripts that control initialization. OTOH, if Sharp stuck something boneheaded in their proprietary code, you'll probably be living with this for a while.
I agree completely and wholeheartedly.
To readers of this post/thread: let me take a moment to turn things around and completely defend Ray's right to question my performance claims. My line of logic and what he was probably basing his doubts on involve two differing technologies. In his defense, all of the CF-device performance issues discussed in this post would completely flip-flop if we were talking about CF flash memory cards rather than Microdrives specifically! Almost all of the performance penalties I'm complaining about are unique to an actual physical hard drive, with physical heads, servos, spinning platters, and the electronic components that control their movements. If we were talking about CF flash memory cards instead, then just about 100% of these performance hits would not exist, because the controlling circuitry is VERY different and CF flash cards have no major moving parts whatsoever. Keep in mind that Microdrives are just that: miniature HDDs in every respect, just on a much smaller scale. So don't be too quick to think he was out in left field for voicing his doubts.
If these topics interest you either way, I would encourage you, the reader, not to take either of our words on this as gospel truth, but rather to spend a half hour or so poking around the forums here and the internet in general. You'll end up with a MUCH better understanding of how hardware and software issues affect end performance on your Zaurus. Many of these things are ones you, the user, can easily control, and by using your resources and setup properly you can see nice performance gains without any additional monetary expenditure. And THAT is ALWAYS a good thing!
Something else to note is that HOW you use your Zaurus and what you use it FOR will greatly impact whether you personally see any real-world performance gains. In my case I will be using X/Qt and some X-based programs that demand desktop/server-level memory and storage resources in order to perform well. Because of this, the things I've discussed matter a LOT in how fast my Zaurus will perform under such a load. And since these things ARE something I can control, I've chosen to do so as much as possible, since options like upgrading my C3100 to a faster CPU and/or more physical RAM are impossible for me at the time of this writing. However, if you are someone who is more apt to use streamlined native Qtopia programs written specifically for your Zaurus, you may never even need a swap file or swap partition in the first place! As a matter of fact, if you do not normally use enough RAM to warrant one, installing one will only DEGRADE the performance of your Zaurus. So for my particular usage these matters, strategies, and precautions make sense. For others they may not!
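If you're not sure whether your usage even warrants swap, a quick look at the standard /proc/meminfo fields under your normal workload will tell you. Something along these lines:

[code]
# Reads the standard MemTotal/MemFree/SwapTotal/SwapFree fields from
# /proc/meminfo. If MemFree stays comfortable under your normal workload,
# adding swap will likely just cost you performance, as noted above.
info = {}
f = open("/proc/meminfo")
for line in f:
    parts = line.split(":")
    if len(parts) == 2:
        info[parts[0].strip()] = parts[1].strip()
f.close()

for key in ("MemTotal", "MemFree", "SwapTotal", "SwapFree"):
    print("%-10s %s" % (key, info.get(key, "n/a")))
[/code]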
I also must close by confessing that ANY performance-inhibiting thing in my Zaurus that I feel should or could be changed drives me CRAZY until it is fixed. I am an absolute performance nut, overly zealous, a performance junkie I suppose. While every point I've made is true within its own context, several of these issues are difficult enough to set up that many users would simply not find the performance boost justification enough to go to the trouble of tackling them. This is even more true if the boost would be minimal because their normal Zaurus usage doesn't push the resources already available beyond normal limits.
Anyway, glad to see you got your system working. Good luck with it.
~ray
Thank you! I'm very glad too, and as always I wish you and everyone the best with theirs. Please don't feel that I went to all of this trouble to be confrontational; rather, I was excited and thrilled that for once someone was asking questions I have intimate knowledge of and the ability to give detailed and hopefully helpful answers to, for you and any other users who may trip over this post! (This doesn't happen often.)
So thank you for the opportunity it has afforded me to help anyone who may learn from this info. It makes me feel better to have a few tidbits to give back to a community that I take so much from so often.
For anyone who's interested: the majority of the knowledge expressed in this post comes from working as a line technician in a robotically driven IDE storage manufacturing facility for several years. I may not know much, but what I do know, I know pretty well. =)
Cheers!,
-NeuroShock
EDIT: The first response was edited for clarity when a reader pointed out that part of the blame lay on the OS/driver side of the issue as well as on the HDD controller. This has been corrected. (Thanks for the keen eye and quick heads up.)