ARM: dma-mapping: fix buffer chunk allocation order

IOMMU-aware dma_alloc_attrs() implementation allocates buffers in
power-of-two chunks to improve performance and take advantage of large
page mappings provided by some IOMMU hardware. However current code, due
to a subtle bug, allocated those chunks in the smallest-to-largest
order, what completely killed all the advantages of using larger than
page chunks. If a 4KiB chunk has been mapped as a first chunk, the
consecutive chunks are not aligned correctly to the power-of-two which
match their size and IOMMU drivers were not able to use internal
mappings of size other than the 4KiB (largest common denominator of
alignment and chunk size).

This patch fixes this issue by changing to the correct largest-to-smallest
chunk size allocation sequence.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index d766e425..4044abc 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1067,7 +1067,7 @@
 		return NULL;
 
 	while (count) {
-		int j, order = __ffs(count);
+		int j, order = __fls(count);
 
 		pages[i] = alloc_pages(gfp | __GFP_NOWARN, order);
 		while (!pages[i] && order)