> Software using many cores
Varti
post Jan 2 2018, 01:04 AM
Post #1





Group: Admin
Posts: 815
Joined: 30-April 08
From: Italy
Member No.: 21,713



The Helio X27 CPU is a 10-core unit; I was wondering if any of you know of software that makes use of most or all of the cores. I only know of gcc and Blender, which use all the available cores, while I have read in this article that games (although mostly commercial ones, hence for x86 CPUs only) usually don't use more than 4 cores: https://www.phoronix.com/scan.php?page=arti...aling&num=1 Any more examples?

Varti
bogomips
post Jan 4 2018, 06:23 PM
Post #2





Group: Members
Posts: 25
Joined: 3-January 18
Member No.: 815,787



mmm, good question...

The cores listed for the X27 are not all equal, unlike what you would expect in an Intel i7 or other desktop processor...

MediaTek says they have 3 clusters of cores: link

QUOTE
CPU Cluster 1: ARM-A72 @ 2.6GHz
CPU Cluster 2: ARM-A53 @ 2.0GHz
CPU Cluster 3: ARM-A53 @ 1.6GHz
Cores: Deca (10)
CPU Bit: 64-bit
Heterogeneous Multi-Processing: Yes


So the X27 contains some A72s, plus A53s clocked high and low.

The idea is that the cores can be switched on/off as demand requires, whenever the OS thinks there is a benefit to be had. Obviously heat and energy usage are the cost of having all cores running.

This is the architecture of the X27:


I think that means the slowest A53s are running all the time, and the faster A53s and fastest A72s only wake up when the OS wants to use them to run a thread.
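One way to check this on a running Linux system would be to read the CPU entries the kernel exposes under sysfs. The sketch below is only an illustration: it assumes the standard /sys/devices/system/cpu layout with cpufreq enabled, and the function names are made up for this example. It groups CPU numbers by their maximum frequency, which should roughly reveal the clusters, and prints which CPUs are currently online. Cores that are offline may not expose a cpufreq directory at all, so they can be missing from the grouping.

```python
import glob
import os
import re
from collections import defaultdict


def read_text(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return None


def clusters_by_max_freq(cpu_root="/sys/devices/system/cpu"):
    """Group CPU numbers by cpuinfo_max_freq (kHz) to approximate clusters."""
    groups = defaultdict(list)
    for cpu_dir in glob.glob(os.path.join(cpu_root, "cpu[0-9]*")):
        match = re.fullmatch(r"cpu(\d+)", os.path.basename(cpu_dir))
        if not match:
            continue
        freq = read_text(os.path.join(cpu_dir, "cpufreq", "cpuinfo_max_freq"))
        if freq is not None and freq.isdigit():
            groups[int(freq)].append(int(match.group(1)))
    return {khz: sorted(cpus) for khz, cpus in sorted(groups.items())}


def online_cpus(cpu_root="/sys/devices/system/cpu"):
    """Return the kernel's 'online' CPU list string (e.g. '0-3'), or None."""
    return read_text(os.path.join(cpu_root, "online"))


if __name__ == "__main__":
    print("online:", online_cpus())
    for khz, cpus in clusters_by_max_freq().items():
        print(f"{khz / 1e6:.2f} GHz: CPUs {cpus}")
```

Running it a few times under light and heavy load would show whether the set of online CPUs actually changes on a given kernel.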

In application design, the benefits of multi-core CPUs are only realized if the application is specifically designed to make use of more than 1 core.
Programs like Blender are designed this way so that they scale better and run faster on more expensive hardware.

Besides the speed of the cores, there is a difference in capability. The lower ones have less cache, shorter pipelines and lack out-of-order execution (though as far as I know the A53 and A72 both implement the same ARMv8-A instruction set, which is what lets a thread move between clusters).
See here.

Early versions of Android could not make use of all the cores at once - they had to switch between a set of cores...

Heterogeneous Multi-Processing means that the OS can use all the cores at the same time, if it chooses. That is good.
Earlier SoCs could only use one cluster at a time.
Even after Heterogeneous Multi-Processing became a thing, I think Samsung had an issue in the S4 or something where the OS could not support it but the hardware could...

Here are some good diagrams to illustrate it.


So - ymmv wrt using all cores. Even if your application supports it, and the OS allows it, you will not get 10x performance over running on 1 core... not all the cores are equal.
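To put a rough number on that, here is a back-of-the-envelope sketch. The clock speeds come from the MediaTek quote above; the core counts per cluster (2 + 4 + 4) are an assumption on my part, since the quote only says "Deca (10)". It also pretends throughput is proportional to clock speed, which flatters the A53s because an A72 does more work per clock, so treat the result as an optimistic upper bound rather than a prediction.

```python
# Assumed cluster layout: (label, core count, clock in GHz).
# The core counts are an assumption; they are not in the quoted spec.
CLUSTERS = [
    ("A72 @ 2.6GHz", 2, 2.6),
    ("A53 @ 2.0GHz", 4, 2.0),
    ("A53 @ 1.6GHz", 4, 1.6),
]


def aggregate_ghz(clusters):
    """Sum of core count times clock speed over all clusters."""
    return sum(count * ghz for _, count, ghz in clusters)


def speedup_over_single_core(clusters, baseline_ghz):
    """Ideal speedup versus one core running at baseline_ghz.

    Crude model: throughput scales with clock only, perfect parallelism,
    no thermal throttling and no memory bottleneck.
    """
    return aggregate_ghz(clusters) / baseline_ghz


if __name__ == "__main__":
    total = aggregate_ghz(CLUSTERS)
    print(f"aggregate clock: {total:.1f} GHz")
    print(f"vs one A72:      {speedup_over_single_core(CLUSTERS, 2.6):.2f}x")
```

Even under these generous assumptions, all ten cores together come out at well under 10x a single A72.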


Another thing to ponder - will the Linux process scheduler in the kernel allow Heterogeneous Multi-Processing?
Android these days obviously supports this...

This article from 2012 seems to suggest no such scheduler existed at the time: https://lwn.net/Articles/481055/

This question from 2014 seems to suggest that it was available in custom kernels: https://stackoverflow.com/questions/2549821...d-in-linux-kern

Comparing that referenced kernel source file with the current one shows no mention of "HMP": https://github.com/torvalds/linux/blob/mast...el/sched/core.c

It may have been refactored out of that source file in the meantime, or it might mean - you need a custom kernel to support HMP??


Also, I wonder if thermal throttling may force the CPU to only use slower cores...

More info here: http://www.sisoftware.eu/2015/06/22/arm-bi...e-lucky-number/
Varti
post Jan 4 2018, 06:51 PM
Post #3





Group: Admin
Posts: 815
Joined: 30-April 08
From: Italy
Member No.: 21,713



Thanks for the detailed explanation! There are many variables at play, it seems. It might be interesting to check whether it will be possible to choose different core management models under Debian, and to compare the results with benchmarks that support multiple cores (Phoronix Test Suite perhaps?), and also to evaluate whether custom HMP kernels could be compiled.
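As a starting point for that kind of comparison, a process can be pinned to chosen cores from user space without touching the kernel. The following is a minimal sketch, assuming Linux (os.sched_setaffinity is not available elsewhere) and using a toy loop in place of a real benchmark; the function names are invented for the example. Which CPU numbers belong to which cluster is device-specific and would have to be checked on the actual hardware.

```python
import os
import time


def busy_work(iterations=200_000):
    """A small CPU-bound loop; stands in for a real benchmark."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def time_on_cpus(cpus, iterations=200_000):
    """Pin this process to `cpus`, time busy_work, then restore affinity."""
    if not hasattr(os, "sched_setaffinity"):
        raise RuntimeError("os.sched_setaffinity is only available on Linux")
    original = os.sched_getaffinity(0)
    os.sched_setaffinity(0, cpus)
    try:
        start = time.perf_counter()
        busy_work(iterations)
        return time.perf_counter() - start
    finally:
        # Always restore the original CPU set, even if the workload fails.
        os.sched_setaffinity(0, original)


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    # Time the same workload on each CPU this process is allowed to use.
    for cpu in sorted(os.sched_getaffinity(0)):
        print(f"cpu{cpu}: {time_on_cpus({cpu}):.4f} s")
```

On a heterogeneous SoC the per-CPU timings should fall into groups, one per cluster, as long as frequency scaling and throttling do not interfere too much.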

Varti
Murple2
post Feb 14 2018, 03:25 AM
Post #4





Group: Members
Posts: 94
Joined: 5-January 18
Member No.: 815,856



The best way to utilise as many cores as possible is to run multiple processes. This is how builds with GCC do it: when you pass -jN to make (for example through MAKEFLAGS), it spawns up to N compiler processes and these are shared among the cores. I've always read that you set N to the number of cores + 1, so if you have 4 cores you run 5 processes and this utilises most of your CPU power. Lots of applications do this, and many allow you to specify how many processes or threads to use. E.g. in mencoder you can pass 'threads=x' to the video encoder and it will run on multiple cores.
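The cores + 1 rule of thumb can be written down as a tiny helper. This is just a sketch: the function names are hypothetical, it only builds the command line rather than running it, and whether cores + 1 is really the best value on a mixed A72/A53 chip is something only a benchmark would tell. GNU make accepts the job count as -jN (or --jobs=N).

```python
import os
import shlex


def suggested_jobs(cores=None):
    """Return cores + 1, the rule of thumb mentioned in the post."""
    if cores is None:
        cores = os.cpu_count() or 1  # cpu_count() can return None
    return cores + 1


def make_command(cores=None, target=None):
    """Build an argv list such as ['make', '-j5'] (not executed here)."""
    argv = ["make", f"-j{suggested_jobs(cores)}"]
    if target:
        argv.append(target)
    return argv


if __name__ == "__main__":
    # Print the command for this machine instead of running it.
    print(shlex.join(make_command()))
```

The resulting list can be handed to subprocess.run() from a build script if you actually want to start the build.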

At a deeper level than this the kernel scheduler comes into play, but I don't think we need to worry about this so much - see my reply on the swappiness topic. Although I'm all for benchmarks and really having a look at the nitty-gritty performance when we get some devices in our hands.
Murple2
post Feb 14 2018, 03:31 AM
Post #5





Group: Members
Posts: 94
Joined: 5-January 18
Member No.: 815,856



QUOTE(bogomips @ Jan 5 2018, 02:23 AM) *
Another thing to ponder - will the Linux process scheduler in the kernel allow Heterogeneous Multi-Processing?


To summarise my answer from another topic - yes, it will. There is at least one big.LITTLE MP scheduler.
Murple2
post Feb 14 2018, 03:47 AM
Post #6





Group: Members
Posts: 94
Joined: 5-January 18
Member No.: 815,856



https://www.google.co.uk/url?sa=t&sourc...F149BpewabD362w

This is some good background on the MediaTek SoCs specifically. I think the issues we will come up against will be about higher power consumption under Linux. I predict performance won't be a problem. Time will tell.
speculatrix
post Feb 17 2018, 12:09 PM
Post #7





Group: Admin
Posts: 3,607
Joined: 29-July 04
From: Cambridge, England
Member No.: 4,149



I think ffmpeg is multithreaded.
There are some compression/decompression tools like pigz which use more cores.
Many graphical programs benefit from more threads.
However, I think at high CPU usage, memory bandwidth will be our limiting factor.