word-at-a-time: provide generic big-endian zero_bytemask implementation

Whilst architectures may be able to do better than this (which they can,
by simply defining their own macro), this is a generic stab at a
zero_bytemask implementation for the asm-generic, big-endian
word-at-a-time implementation.

On arm64, a clz instruction is used to implement the fls efficiently.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/include/asm-generic/word-at-a-time.h b/include/asm-generic/word-at-a-time.h
index 3f21f1b..d3909ef 100644
--- a/include/asm-generic/word-at-a-time.h
+++ b/include/asm-generic/word-at-a-time.h
@@ -49,4 +49,12 @@
 	return (val + c->high_bits) & ~rhs;
 }
 
+#ifndef zero_bytemask
+#ifdef CONFIG_64BIT
+#define zero_bytemask(mask)	(~0ul << fls64(mask))
+#else
+#define zero_bytemask(mask)	(~0ul << fls(mask))
+#endif /* CONFIG_64BIT */
+#endif /* zero_bytemask */
+
 #endif /* _ASM_WORD_AT_A_TIME_H */