Blackfin: optimize strncpy a bit

Add a little strncpy optimization which can easily cut boot time by 20%.

When the kernel is booting with initramfs, it builds up the filesystem
from a cpio archive by calling strncpy_from_user() via fs/namei.c's
do_getname() on every file in the archive (which can be lots) with a
length of PATH_MAX (1024).  This causes the dest of the strncpy to be
padded with many NUL bytes.

This optimization mostly causes these NUL bytes to be padded with a call
to memset() which is already optimized for filling memory quickly, but
the hardware loop helps a little bit as well.

Boot time measured with 'loglevel=0' so UART speed doesn't get in the way.

Signed-off-by: Robin Getz <robin.getz@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
diff --git a/arch/blackfin/lib/memset.S b/arch/blackfin/lib/memset.S
index c30d99b1..eab1bef 100644
--- a/arch/blackfin/lib/memset.S
+++ b/arch/blackfin/lib/memset.S
@@ -20,6 +20,7 @@
  * R1 = filler byte
  * R2 = count
  * Favours word aligned data.
+ * The strncpy assumes that I0 and I1 are not used in this function
  */
 
 ENTRY(_memset)