Merge tag 'fork-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull fork cleanups from Christian Brauner:
 "This is a cleanup series from when we reworked a chunk of the process
  creation paths in the kernel and switched to struct
  {kernel_}clone_args.

  At a high level, this does two main things:

   - Remove the double export of both do_fork() and _do_fork() where
     do_fork() used the inconsistent legacy clone calling convention.

     Now we only export _do_fork() which is based on struct
     kernel_clone_args.

   - Remove the copy_thread_tls()/copy_thread() split making the
     architecture specific HAVE_COPY_THREAD_TLS config option obsolete.

  This switches all remaining architectures to select
  HAVE_COPY_THREAD_TLS and thus to the copy_thread_tls() calling
  convention. The current split makes the process creation codepaths
  more convoluted than they need to be. Each architecture has its own
  copy_thread() function unless it selects HAVE_COPY_THREAD_TLS, in
  which case it has a copy_thread_tls() function instead.

  The split is no longer needed: all architectures support
  CLONE_SETTLS, but quite a few of them never bothered to select
  HAVE_COPY_THREAD_TLS and instead simply continued to use copy_thread()
  and the old calling convention. Removing this split cleans up the
  process creation codepaths and paves the way for implementing clone3()
  on such architectures since it requires the copy_thread_tls() calling
  convention.

  After having made each architecture support copy_thread_tls() this
  series simply renames that function back to copy_thread(). It also
  switches all architectures that call do_fork() directly over to
  _do_fork() and the struct kernel_clone_args calling convention. This
  is a corollary of switching the architectures that did not yet support
  it over to copy_thread_tls() since do_fork() is conditional on not
  supporting copy_thread_tls() (mostly because it lacks a separate
  argument for tls, which is trivial to fix, but there's no need for
  this function to exist).

  The do_fork() removal is in itself already useful as it allows us to
  remove the export of both do_fork() and _do_fork() we currently have
  in favor of only _do_fork(). This has already been discussed back when
  we added clone3(). The legacy clone() calling convention is - as is
  probably well-known - somewhat odd:

    #
    # ABI hall of shame
    #
    config CLONE_BACKWARDS
    config CLONE_BACKWARDS2
    config CLONE_BACKWARDS3

  that is aggravated by the fact that some architectures such as sparc
  follow the CLONE_BACKWARDSx calling convention but don't really select
  the corresponding config option since they call do_fork() directly.

  So do_fork() enforces a somewhat arbitrary calling convention in the
  first place that doesn't really help the individual architectures that
  deviate from it. They can thus simply be switched to _do_fork()
  enforcing a single calling convention. (I really hope that any new
  architectures will __not__ try to implement their own calling
  conventions...)

  Most architectures have already made a similar switch (m68k comes to
  mind).

  Overall this removes more code than it adds even with a good portion
  of added comments. It simplifies a chunk of arch specific assembly
  either by moving the code into C or by simply rewriting the assembly.

  Architectures that have been touched in non-trivial ways have all been
  actually boot and stress tested: sparc and ia64 have been tested with
  Debian 9 images. They are the two architectures which have been
  touched the most. All non-trivial changes to architectures have seen
  acks from the relevant maintainers. nios2 has been tested with a
  custom-built buildroot image. For h8300 I couldn't get something
  bootable to test on, but the changes have been fairly automatic and
  I'm sure we'll hear people yell if I broke something there.

  All other architectures that have been touched in trivial ways have
  been compile tested for every single patch of the series via git
  rebase -x "make ..." v5.8-rc2. arm{64} and x86{_64} have been boot
  tested even though they have just been trivially touched (removal of
  the HAVE_COPY_THREAD_TLS select from their Kconfig) because, well,
  they are basically "core architectures" and since it is trivial to get
  your hands on a usable image"

* tag 'fork-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  arch: rename copy_thread_tls() back to copy_thread()
  arch: remove HAVE_COPY_THREAD_TLS
  unicore: switch to copy_thread_tls()
  sh: switch to copy_thread_tls()
  nds32: switch to copy_thread_tls()
  microblaze: switch to copy_thread_tls()
  hexagon: switch to copy_thread_tls()
  c6x: switch to copy_thread_tls()
  alpha: switch to copy_thread_tls()
  fork: remove do_fork()
  h8300: select HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
  nios2: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
  ia64: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
  sparc: unconditionally enable HAVE_COPY_THREAD_TLS
  sparc: share process creation helpers between sparc and sparc64
  sparc64: enable HAVE_COPY_THREAD_TLS
  fork: fold legacy_clone_args_valid() into _do_fork()
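
Illustration (not part of the series): the converted legacy clone() entry
points in the diff below (h8300 sys_clone(), ia64_clone(), nios2_clone())
all split the legacy flags word the same way before calling _do_fork().
A minimal userspace sketch of that split follows. The demo_* names and
the reduced struct are hypothetical stand-ins for lower_32_bits() and
struct kernel_clone_args; only the CSIGNAL value (0xff) is taken from
the uapi headers.

```c
#include <assert.h>
#include <stdint.h>

/* CSIGNAL in the Linux uapi headers: the low byte carries the exit signal. */
#define DEMO_CSIGNAL 0x000000ffu

/* Hypothetical, reduced stand-in for struct kernel_clone_args. */
struct demo_clone_args {
	uint64_t flags;
	int exit_signal;
	uint64_t stack;
	uint64_t tls;
};

/* Stand-in for the kernel's lower_32_bits() helper. */
static uint32_t demo_lower_32_bits(uint64_t value)
{
	return (uint32_t)(value & 0xffffffffu);
}

/*
 * Split a legacy clone() flags word into "real" flags and the exit
 * signal, and pass the stack and tls values as plain arguments instead
 * of fishing them out of saved registers.
 */
static struct demo_clone_args demo_legacy_to_args(uint64_t clone_flags,
						  uint64_t newsp, uint64_t tls)
{
	uint32_t low = demo_lower_32_bits(clone_flags);
	struct demo_clone_args args = {
		.flags		= low & ~DEMO_CSIGNAL,
		.exit_signal	= (int)(low & DEMO_CSIGNAL),
		.stack		= newsp,
		.tls		= tls,
	};

	/* The two parts are disjoint and together give back the low 32 bits. */
	assert((args.flags & DEMO_CSIGNAL) == 0);
	assert((args.flags | (uint64_t)args.exit_signal) == low);
	return args;
}
```

With a flags word of 0x00080100 | 17, the sketch yields flags 0x00080100
and exit_signal 17. Anything above bit 31 is dropped, matching the
lower_32_bits() calls in the converted entry points.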
diff --git a/arch/Kconfig b/arch/Kconfig
index 8cc35dc..943aac2 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -754,13 +754,6 @@
 	depends on MMU
 	select ARCH_HAS_ELF_RANDOMIZE
 
-config HAVE_COPY_THREAD_TLS
-	bool
-	help
-	  Architecture provides copy_thread_tls to accept tls argument via
-	  normal C parameter passing, rather than extracting the syscall
-	  argument from pt_regs.
-
 config HAVE_STACK_VALIDATION
 	bool
 	help
diff --git a/arch/alpha/kernel/process.c b/arch/alpha/kernel/process.c
index b45f0b0..7462a79 100644
--- a/arch/alpha/kernel/process.c
+++ b/arch/alpha/kernel/process.c
@@ -233,10 +233,9 @@ release_thread(struct task_struct *dead_task)
 /*
  * Copy architecture-specific thread state
  */
-int
-copy_thread(unsigned long clone_flags, unsigned long usp,
-	    unsigned long kthread_arg,
-	    struct task_struct *p)
+int copy_thread(unsigned long clone_flags, unsigned long usp,
+		unsigned long kthread_arg, struct task_struct *p,
+		unsigned long tls)
 {
 	extern void ret_from_fork(void);
 	extern void ret_from_kernel_thread(void);
@@ -267,7 +266,7 @@ copy_thread(unsigned long clone_flags, unsigned long usp,
 	   required for proper operation in the case of a threaded
 	   application calling fork.  */
 	if (clone_flags & CLONE_SETTLS)
-		childti->pcb.unique = regs->r20;
+		childti->pcb.unique = tls;
 	else
 		regs->r20 = 0;	/* OSF/1 has some strange fork() semantics.  */
 	childti->pcb.usp = usp ?: rdusp();
diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 197896c..ba00c4e 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -29,7 +29,6 @@
 	select GENERIC_SMP_IDLE_THREAD
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_TRACEHOOK
-	select HAVE_COPY_THREAD_TLS
 	select HAVE_DEBUG_STACKOVERFLOW
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_FUTEX_CMPXCHG if FUTEX
diff --git a/arch/arc/kernel/process.c b/arch/arc/kernel/process.c
index 8c8e517..105420c 100644
--- a/arch/arc/kernel/process.c
+++ b/arch/arc/kernel/process.c
@@ -173,8 +173,9 @@ asmlinkage void ret_from_fork(void);
  * |    user_r25    |
  * ------------------  <===== END of PAGE
  */
-int copy_thread_tls(unsigned long clone_flags, unsigned long usp,
-	unsigned long kthread_arg, struct task_struct *p, unsigned long tls)
+int copy_thread(unsigned long clone_flags, unsigned long usp,
+		unsigned long kthread_arg, struct task_struct *p,
+		unsigned long tls)
 {
 	struct pt_regs *c_regs;        /* child's pt_regs */
 	unsigned long *childksp;       /* to unwind out of __switch_to() */
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index d54c413..c51eaeb 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -72,7 +72,6 @@
 	select HAVE_ARM_SMCCC if CPU_V7
 	select HAVE_EBPF_JIT if !CPU_ENDIAN_BE32
 	select HAVE_CONTEXT_TRACKING
-	select HAVE_COPY_THREAD_TLS
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_DEBUG_KMEMLEAK if !XIP_KERNEL
 	select HAVE_DMA_CONTIGUOUS if MMU
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 58eaa1f..3395be1 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -225,9 +225,8 @@ void release_thread(struct task_struct *dead_task)
 
 asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
 
-int
-copy_thread_tls(unsigned long clone_flags, unsigned long stack_start,
-	    unsigned long stk_sz, struct task_struct *p, unsigned long tls)
+int copy_thread(unsigned long clone_flags, unsigned long stack_start,
+		unsigned long stk_sz, struct task_struct *p, unsigned long tls)
 {
 	struct thread_info *thread = task_thread_info(p);
 	struct pt_regs *childregs = task_pt_regs(p);
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 73aee72..e11b4ea 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -149,7 +149,6 @@
 	select HAVE_CMPXCHG_DOUBLE
 	select HAVE_CMPXCHG_LOCAL
 	select HAVE_CONTEXT_TRACKING
-	select HAVE_COPY_THREAD_TLS
 	select HAVE_DEBUG_BUGVERBOSE
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_CONTIGUOUS
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 6089638..84ec630 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -375,7 +375,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 
 asmlinkage void ret_from_fork(void) asm("ret_from_fork");
 
-int copy_thread_tls(unsigned long clone_flags, unsigned long stack_start,
+int copy_thread(unsigned long clone_flags, unsigned long stack_start,
 		unsigned long stk_sz, struct task_struct *p, unsigned long tls)
 {
 	struct pt_regs *childregs = task_pt_regs(p);
diff --git a/arch/c6x/kernel/process.c b/arch/c6x/kernel/process.c
index cb9c8b6..9f4fd6a 100644
--- a/arch/c6x/kernel/process.c
+++ b/arch/c6x/kernel/process.c
@@ -105,8 +105,8 @@ void start_thread(struct pt_regs *regs, unsigned int pc, unsigned long usp)
  * Copy a new thread context in its stack.
  */
 int copy_thread(unsigned long clone_flags, unsigned long usp,
-		unsigned long ustk_size,
-		struct task_struct *p)
+		unsigned long ustk_size, struct task_struct *p,
+		unsigned long tls)
 {
 	struct pt_regs *childregs;
 
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index bd31ab1..902f114 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -38,7 +38,6 @@
 	select GX6605S_TIMER if CPU_CK610
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_AUDITSYSCALL
-	select HAVE_COPY_THREAD_TLS
 	select HAVE_DEBUG_BUGVERBOSE
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
diff --git a/arch/csky/kernel/process.c b/arch/csky/kernel/process.c
index 8b3fad0..28cfeaa 100644
--- a/arch/csky/kernel/process.c
+++ b/arch/csky/kernel/process.c
@@ -40,7 +40,7 @@ unsigned long thread_saved_pc(struct task_struct *tsk)
 	return sw->r15;
 }
 
-int copy_thread_tls(unsigned long clone_flags,
+int copy_thread(unsigned long clone_flags,
 		unsigned long usp,
 		unsigned long kthread_arg,
 		struct task_struct *p,
diff --git a/arch/h8300/kernel/process.c b/arch/h8300/kernel/process.c
index 0ef55e3..83ce3ca 100644
--- a/arch/h8300/kernel/process.c
+++ b/arch/h8300/kernel/process.c
@@ -105,9 +105,8 @@ void flush_thread(void)
 {
 }
 
-int copy_thread(unsigned long clone_flags,
-		unsigned long usp, unsigned long topstk,
-		struct task_struct *p)
+int copy_thread(unsigned long clone_flags, unsigned long usp,
+		unsigned long topstk, struct task_struct *p, unsigned long tls)
 {
 	struct pt_regs *childregs;
 
@@ -159,11 +158,19 @@ asmlinkage int sys_clone(unsigned long __user *args)
 	unsigned long  newsp;
 	uintptr_t parent_tidptr;
 	uintptr_t child_tidptr;
+	struct kernel_clone_args kargs = {};
 
 	get_user(clone_flags, &args[0]);
 	get_user(newsp, &args[1]);
 	get_user(parent_tidptr, &args[2]);
 	get_user(child_tidptr, &args[3]);
-	return do_fork(clone_flags, newsp, 0,
-		       (int __user *)parent_tidptr, (int __user *)child_tidptr);
+
+	kargs.flags		= (lower_32_bits(clone_flags) & ~CSIGNAL);
+	kargs.pidfd		= (int __user *)parent_tidptr;
+	kargs.child_tid		= (int __user *)child_tidptr;
+	kargs.parent_tid	= (int __user *)parent_tidptr;
+	kargs.exit_signal	= (lower_32_bits(clone_flags) & CSIGNAL);
+	kargs.stack		= newsp;
+
+	return _do_fork(&kargs);
 }
diff --git a/arch/hexagon/kernel/process.c b/arch/hexagon/kernel/process.c
index ac07f5f..d294e71 100644
--- a/arch/hexagon/kernel/process.c
+++ b/arch/hexagon/kernel/process.c
@@ -50,8 +50,8 @@ void arch_cpu_idle(void)
 /*
  * Copy architecture-specific thread state
  */
-int copy_thread(unsigned long clone_flags, unsigned long usp,
-		unsigned long arg, struct task_struct *p)
+int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
+		struct task_struct *p, unsigned long tls)
 {
 	struct thread_info *ti = task_thread_info(p);
 	struct hexagon_switch_stack *ss;
@@ -100,7 +100,7 @@ int copy_thread(unsigned long clone_flags, unsigned long usp,
 	 * ugp is used to provide TLS support.
 	 */
 	if (clone_flags & CLONE_SETTLS)
-		childregs->ugp = childregs->r04;
+		childregs->ugp = tls;
 
 	/*
 	 * Parent sees new pid -- not necessary, not even possible at
diff --git a/arch/ia64/kernel/entry.S b/arch/ia64/kernel/entry.S
index c5efac2..e98e3da 100644
--- a/arch/ia64/kernel/entry.S
+++ b/arch/ia64/kernel/entry.S
@@ -112,19 +112,16 @@
 	.prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(8)
 	alloc r16=ar.pfs,8,2,6,0
 	DO_SAVE_SWITCH_STACK
-	adds r2=PT(R16)+IA64_SWITCH_STACK_SIZE+16,sp
 	mov loc0=rp
-	mov loc1=r16				// save ar.pfs across do_fork
+	mov loc1=r16                             // save ar.pfs across ia64_clone
 	.body
+	mov out0=in0
 	mov out1=in1
 	mov out2=in2
-	tbit.nz p6,p0=in0,CLONE_SETTLS_BIT
-	mov out3=in3	// parent_tidptr: valid only w/CLONE_PARENT_SETTID
-	;;
-(p6)	st8 [r2]=in5				// store TLS in r16 for copy_thread()
-	mov out4=in4	// child_tidptr:  valid only w/CLONE_CHILD_SETTID or CLONE_CHILD_CLEARTID
-	mov out0=in0				// out0 = clone_flags
-	br.call.sptk.many rp=do_fork
+	mov out3=in3
+	mov out4=in4
+	mov out5=in5
+	br.call.sptk.many rp=ia64_clone
 .ret1:	.restore sp
 	adds sp=IA64_SWITCH_STACK_SIZE,sp	// pop the switch stack
 	mov ar.pfs=loc1
@@ -143,19 +140,16 @@
 	.prologue ASM_UNW_PRLG_RP|ASM_UNW_PRLG_PFS, ASM_UNW_PRLG_GRSAVE(8)
 	alloc r16=ar.pfs,8,2,6,0
 	DO_SAVE_SWITCH_STACK
-	adds r2=PT(R16)+IA64_SWITCH_STACK_SIZE+16,sp
 	mov loc0=rp
-	mov loc1=r16				// save ar.pfs across do_fork
+	mov loc1=r16                             // save ar.pfs across ia64_clone
 	.body
+	mov out0=in0
 	mov out1=in1
 	mov out2=16				// stacksize (compensates for 16-byte scratch area)
-	tbit.nz p6,p0=in0,CLONE_SETTLS_BIT
-	mov out3=in2	// parent_tidptr: valid only w/CLONE_PARENT_SETTID
-	;;
-(p6)	st8 [r2]=in4				// store TLS in r13 (tp)
-	mov out4=in3	// child_tidptr:  valid only w/CLONE_CHILD_SETTID or CLONE_CHILD_CLEARTID
-	mov out0=in0				// out0 = clone_flags
-	br.call.sptk.many rp=do_fork
+	mov out3=in3
+	mov out4=in4
+	mov out5=in5
+	br.call.sptk.many rp=ia64_clone
 .ret2:	.restore sp
 	adds sp=IA64_SWITCH_STACK_SIZE,sp	// pop the switch stack
 	mov ar.pfs=loc1
@@ -590,7 +584,7 @@
 	nop.i 0
 	/*
 	 * We need to call schedule_tail() to complete the scheduling process.
-	 * Called by ia64_switch_to() after do_fork()->copy_thread().  r8 contains the
+	 * Called by ia64_switch_to() after ia64_clone()->copy_thread().  r8 contains the
 	 * address of the previously executing task.
 	 */
 	br.call.sptk.many rp=ia64_invoke_schedule_tail
diff --git a/arch/ia64/kernel/process.c b/arch/ia64/kernel/process.c
index da55b41..7a4de9d 100644
--- a/arch/ia64/kernel/process.c
+++ b/arch/ia64/kernel/process.c
@@ -296,7 +296,7 @@ ia64_load_extra (struct task_struct *task)
 		pfm_load_regs(task);
 
 	info = __this_cpu_read(pfm_syst_info);
-	if (info & PFM_CPUINFO_SYST_WIDE) 
+	if (info & PFM_CPUINFO_SYST_WIDE)
 		pfm_syst_wide_update_task(task, info, 1);
 #endif
 }
@@ -310,7 +310,7 @@ ia64_load_extra (struct task_struct *task)
  *
  *	<clone syscall>	        <some kernel call frames>
  *	sys_clone		   :
- *	do_fork			do_fork
+ *	_do_fork		_do_fork
  *	copy_thread		copy_thread
  *
  * This means that the stack layout is as follows:
@@ -333,9 +333,8 @@ ia64_load_extra (struct task_struct *task)
  * so there is nothing to worry about.
  */
 int
-copy_thread(unsigned long clone_flags,
-	     unsigned long user_stack_base, unsigned long user_stack_size,
-	     struct task_struct *p)
+copy_thread(unsigned long clone_flags, unsigned long user_stack_base,
+	    unsigned long user_stack_size, struct task_struct *p, unsigned long tls)
 {
 	extern char ia64_ret_from_clone;
 	struct switch_stack *child_stack, *stack;
@@ -416,7 +415,7 @@ copy_thread(unsigned long clone_flags,
 	rbs_size = stack->ar_bspstore - rbs;
 	memcpy((void *) child_rbs, (void *) rbs, rbs_size);
 	if (clone_flags & CLONE_SETTLS)
-		child_ptregs->r13 = regs->r16;	/* see sys_clone2() in entry.S */
+		child_ptregs->r13 = tls;
 	if (user_stack_base) {
 		child_ptregs->r12 = user_stack_base + user_stack_size - 16;
 		child_ptregs->ar_bspstore = user_stack_base;
@@ -441,6 +440,24 @@ copy_thread(unsigned long clone_flags,
 	return retval;
 }
 
+asmlinkage long ia64_clone(unsigned long clone_flags, unsigned long stack_start,
+			   unsigned long stack_size, unsigned long parent_tidptr,
+			   unsigned long child_tidptr, unsigned long tls)
+{
+	struct kernel_clone_args args = {
+		.flags		= (lower_32_bits(clone_flags) & ~CSIGNAL),
+		.pidfd		= (int __user *)parent_tidptr,
+		.child_tid	= (int __user *)child_tidptr,
+		.parent_tid	= (int __user *)parent_tidptr,
+		.exit_signal	= (lower_32_bits(clone_flags) & CSIGNAL),
+		.stack		= stack_start,
+		.stack_size	= stack_size,
+		.tls		= tls,
+	};
+
+	return _do_fork(&args);
+}
+
 static void
 do_copy_task_regs (struct task_struct *task, struct unw_frame_info *info, void *arg)
 {
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 6ad6cda..6663f17 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -14,7 +14,6 @@
 	select HAVE_AOUT if MMU
 	select HAVE_ASM_MODVERSIONS
 	select HAVE_DEBUG_BUGVERBOSE
-	select HAVE_COPY_THREAD_TLS
 	select GENERIC_IRQ_SHOW
 	select GENERIC_ATOMIC64
 	select HAVE_UID16
diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c
index 90ae376..6492a2c 100644
--- a/arch/m68k/kernel/process.c
+++ b/arch/m68k/kernel/process.c
@@ -125,9 +125,6 @@ asmlinkage int m68k_clone(struct pt_regs *regs)
 		.tls		= regs->d5,
 	};
 
-	if (!legacy_clone_args_valid(&args))
-		return -EINVAL;
-
 	return _do_fork(&args);
 }
 
@@ -141,9 +138,8 @@ asmlinkage int m68k_clone3(struct pt_regs *regs)
 	return sys_clone3((struct clone_args __user *)regs->d1, regs->d2);
 }
 
-int copy_thread_tls(unsigned long clone_flags, unsigned long usp,
-		    unsigned long arg, struct task_struct *p,
-		    unsigned long tls)
+int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
+		struct task_struct *p, unsigned long tls)
 {
 	struct fork_frame {
 		struct switch_stack sw;
diff --git a/arch/microblaze/kernel/process.c b/arch/microblaze/kernel/process.c
index 6527ec2..6cabeab9 100644
--- a/arch/microblaze/kernel/process.c
+++ b/arch/microblaze/kernel/process.c
@@ -54,8 +54,8 @@ void flush_thread(void)
 {
 }
 
-int copy_thread(unsigned long clone_flags, unsigned long usp,
-		unsigned long arg, struct task_struct *p)
+int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
+		struct task_struct *p, unsigned long tls)
 {
 	struct pt_regs *childregs = task_pt_regs(p);
 	struct thread_info *ti = task_thread_info(p);
@@ -114,7 +114,7 @@ int copy_thread(unsigned long clone_flags, unsigned long usp,
 	 *  which contains TLS area
 	 */
 	if (clone_flags & CLONE_SETTLS)
-		childregs->r21 = childregs->r10;
+		childregs->r21 = tls;
 
 	return 0;
 }
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 6fee1a1..ca92c3e 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -51,7 +51,6 @@
 	select HAVE_CBPF_JIT if !64BIT && !CPU_MICROMIPS
 	select HAVE_CONTEXT_TRACKING
 	select HAVE_TIF_NOHZ
-	select HAVE_COPY_THREAD_TLS
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DEBUG_STACKOVERFLOW
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index ff5320b..f5dc316 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -119,8 +119,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 /*
  * Copy architecture-specific thread state
  */
-int copy_thread_tls(unsigned long clone_flags, unsigned long usp,
-	unsigned long kthread_arg, struct task_struct *p, unsigned long tls)
+int copy_thread(unsigned long clone_flags, unsigned long usp,
+		unsigned long kthread_arg, struct task_struct *p,
+		unsigned long tls)
 {
 	struct thread_info *ti = task_thread_info(p);
 	struct pt_regs *childregs, *regs = current_pt_regs();
diff --git a/arch/nds32/kernel/process.c b/arch/nds32/kernel/process.c
index 9712fd4..e85bbba 100644
--- a/arch/nds32/kernel/process.c
+++ b/arch/nds32/kernel/process.c
@@ -150,7 +150,7 @@ DEFINE_PER_CPU(struct task_struct *, __entry_task);
 
 asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
 int copy_thread(unsigned long clone_flags, unsigned long stack_start,
-		unsigned long stk_sz, struct task_struct *p)
+		unsigned long stk_sz, struct task_struct *p, unsigned long tls)
 {
 	struct pt_regs *childregs = task_pt_regs(p);
 
@@ -170,7 +170,7 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
 		childregs->uregs[0] = 0;
 		childregs->osp = 0;
 		if (clone_flags & CLONE_SETTLS)
-			childregs->uregs[25] = childregs->uregs[3];
+			childregs->uregs[25] = tls;
 	}
 	/* cpu context switching  */
 	p->thread.cpu_context.pc = (unsigned long)ret_from_fork;
diff --git a/arch/nios2/kernel/entry.S b/arch/nios2/kernel/entry.S
index 3d8d1d0..da84424 100644
--- a/arch/nios2/kernel/entry.S
+++ b/arch/nios2/kernel/entry.S
@@ -389,12 +389,7 @@
  */
 ENTRY(sys_clone)
 	SAVE_SWITCH_STACK
-	addi	sp, sp, -4
-	stw	r7, 0(sp)	/* Pass 5th arg thru stack */
-	mov	r7, r6		/* 4th arg is 3rd of clone() */
-	mov	r6, zero	/* 3rd arg always 0 */
-	call	do_fork
-	addi	sp, sp, 4
+	call	nios2_clone
 	RESTORE_SWITCH_STACK
 	ret
 
diff --git a/arch/nios2/kernel/process.c b/arch/nios2/kernel/process.c
index 509e785..0a42ab8 100644
--- a/arch/nios2/kernel/process.c
+++ b/arch/nios2/kernel/process.c
@@ -100,8 +100,8 @@ void flush_thread(void)
 {
 }
 
-int copy_thread(unsigned long clone_flags,
-		unsigned long usp, unsigned long arg, struct task_struct *p)
+int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
+		struct task_struct *p, unsigned long tls)
 {
 	struct pt_regs *childregs = task_pt_regs(p);
 	struct pt_regs *regs;
@@ -140,7 +140,7 @@ int copy_thread(unsigned long clone_flags,
 
 	/* Initialize tls register. */
 	if (clone_flags & CLONE_SETTLS)
-		childstack->r23 = regs->r8;
+		childstack->r23 = tls;
 
 	return 0;
 }
@@ -259,3 +259,20 @@ int dump_fpu(struct pt_regs *regs, elf_fpregset_t *r)
 {
 	return 0; /* Nios2 has no FPU and thus no FPU registers */
 }
+
+asmlinkage int nios2_clone(unsigned long clone_flags, unsigned long newsp,
+			   int __user *parent_tidptr, int __user *child_tidptr,
+			   unsigned long tls)
+{
+	struct kernel_clone_args args = {
+		.flags		= (lower_32_bits(clone_flags) & ~CSIGNAL),
+		.pidfd		= parent_tidptr,
+		.child_tid	= child_tidptr,
+		.parent_tid	= parent_tidptr,
+		.exit_signal	= (lower_32_bits(clone_flags) & CSIGNAL),
+		.stack		= newsp,
+		.tls		= tls,
+	};
+
+	return _do_fork(&args);
+}
diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
index 8588996..7e94fe3 100644
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -16,7 +16,6 @@
 	select HANDLE_DOMAIN_IRQ
 	select GPIOLIB
 	select HAVE_ARCH_TRACEHOOK
-	select HAVE_COPY_THREAD_TLS
 	select SPARSE_IRQ
 	select GENERIC_IRQ_CHIP
 	select GENERIC_IRQ_PROBE
diff --git a/arch/openrisc/kernel/process.c b/arch/openrisc/kernel/process.c
index d7010e7..848f74c 100644
--- a/arch/openrisc/kernel/process.c
+++ b/arch/openrisc/kernel/process.c
@@ -116,7 +116,7 @@ void release_thread(struct task_struct *dead_task)
 extern asmlinkage void ret_from_fork(void);
 
 /*
- * copy_thread_tls
+ * copy_thread
  * @clone_flags: flags
  * @usp: user stack pointer or fn for kernel thread
  * @arg: arg to fn for kernel thread; always NULL for userspace thread
@@ -147,8 +147,8 @@ extern asmlinkage void ret_from_fork(void);
  */
 
 int
-copy_thread_tls(unsigned long clone_flags, unsigned long usp,
-		unsigned long arg, struct task_struct *p, unsigned long tls)
+copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
+	    struct task_struct *p, unsigned long tls)
 {
 	struct pt_regs *userregs;
 	struct pt_regs *kregs;
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 8e4c370..2667eeb 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -62,7 +62,6 @@
 	select HAVE_FTRACE_MCOUNT_RECORD if HAVE_DYNAMIC_FTRACE
 	select HAVE_KPROBES_ON_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
-	select HAVE_COPY_THREAD_TLS
 
 	help
 	  The PA-RISC microprocessor is designed by Hewlett-Packard and used
diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
index b7abb12..de6299f 100644
--- a/arch/parisc/kernel/process.c
+++ b/arch/parisc/kernel/process.c
@@ -208,7 +208,7 @@ arch_initcall(parisc_idle_init);
  * Copy architecture-specific thread state
  */
 int
-copy_thread_tls(unsigned long clone_flags, unsigned long usp,
+copy_thread(unsigned long clone_flags, unsigned long usp,
 	    unsigned long kthread_arg, struct task_struct *p, unsigned long tls)
 {
 	struct pt_regs *cregs = &(p->thread.regs);
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9fa23eb..3b262d87 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -186,7 +186,6 @@
 	select HAVE_STACKPROTECTOR		if PPC32 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2)
 	select HAVE_CONTEXT_TRACKING		if PPC64
 	select HAVE_TIF_NOHZ			if PPC64
-	select HAVE_COPY_THREAD_TLS
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DEBUG_STACKOVERFLOW
 	select HAVE_DYNAMIC_FTRACE
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 4650b9b..794b754 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1593,7 +1593,7 @@ static void setup_ksp_vsid(struct task_struct *p, unsigned long sp)
 /*
  * Copy architecture-specific thread state
  */
-int copy_thread_tls(unsigned long clone_flags, unsigned long usp,
+int copy_thread(unsigned long clone_flags, unsigned long usp,
 		unsigned long kthread_arg, struct task_struct *p,
 		unsigned long tls)
 {
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 3230c1d..6c4bce7 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -54,7 +54,6 @@
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ASM_MODVERSIONS
-	select HAVE_COPY_THREAD_TLS
 	select HAVE_DMA_CONTIGUOUS if MMU
 	select HAVE_EBPF_JIT if MMU
 	select HAVE_FUTEX_CMPXCHG if FUTEX
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 824d117..31f3944 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -101,8 +101,8 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	return 0;
 }
 
-int copy_thread_tls(unsigned long clone_flags, unsigned long usp,
-	unsigned long arg, struct task_struct *p, unsigned long tls)
+int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
+		struct task_struct *p, unsigned long tls)
 {
 	struct pt_regs *childregs = task_pt_regs(p);
 
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 9cfd8de..e55debe 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -136,7 +136,6 @@
 	select HAVE_EBPF_JIT if PACK_STACK && HAVE_MARCH_Z196_FEATURES
 	select HAVE_CMPXCHG_DOUBLE
 	select HAVE_CMPXCHG_LOCAL
-	select HAVE_COPY_THREAD_TLS
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
index eb6e23a..b06dec1 100644
--- a/arch/s390/kernel/process.c
+++ b/arch/s390/kernel/process.c
@@ -80,8 +80,8 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	return 0;
 }
 
-int copy_thread_tls(unsigned long clone_flags, unsigned long new_stackp,
-		    unsigned long arg, struct task_struct *p, unsigned long tls)
+int copy_thread(unsigned long clone_flags, unsigned long new_stackp,
+		unsigned long arg, struct task_struct *p, unsigned long tls)
 {
 	struct fake_frame
 	{
diff --git a/arch/sh/kernel/process_32.c b/arch/sh/kernel/process_32.c
index 456cc8d..b0fefd8 100644
--- a/arch/sh/kernel/process_32.c
+++ b/arch/sh/kernel/process_32.c
@@ -115,8 +115,8 @@ EXPORT_SYMBOL(dump_fpu);
 asmlinkage void ret_from_fork(void);
 asmlinkage void ret_from_kernel_thread(void);
 
-int copy_thread(unsigned long clone_flags, unsigned long usp,
-		unsigned long arg, struct task_struct *p)
+int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
+		struct task_struct *p, unsigned long tls)
 {
 	struct thread_info *ti = task_thread_info(p);
 	struct pt_regs *childregs;
@@ -158,7 +158,7 @@ int copy_thread(unsigned long clone_flags, unsigned long usp,
 	ti->addr_limit = USER_DS;
 
 	if (clone_flags & CLONE_SETTLS)
-		childregs->gbr = childregs->regs[0];
+		childregs->gbr = tls;
 
 	childregs->regs[0] = 0; /* Set return value for child */
 	p->thread.pc = (unsigned long) ret_from_fork;
diff --git a/arch/sparc/include/asm/syscalls.h b/arch/sparc/include/asm/syscalls.h
index 1d819f5..35575fb 100644
--- a/arch/sparc/include/asm/syscalls.h
+++ b/arch/sparc/include/asm/syscalls.h
@@ -4,9 +4,8 @@
 
 struct pt_regs;
 
-asmlinkage long sparc_do_fork(unsigned long clone_flags,
-			      unsigned long stack_start,
-			      struct pt_regs *regs,
-			      unsigned long stack_size);
+asmlinkage long sparc_fork(struct pt_regs *regs);
+asmlinkage long sparc_vfork(struct pt_regs *regs);
+asmlinkage long sparc_clone(struct pt_regs *regs);
 
 #endif /* _SPARC64_SYSCALLS_H */
diff --git a/arch/sparc/kernel/Makefile b/arch/sparc/kernel/Makefile
index 97c0e192..d3a0e07 100644
--- a/arch/sparc/kernel/Makefile
+++ b/arch/sparc/kernel/Makefile
@@ -33,6 +33,7 @@
 obj-$(CONFIG_SPARC32)   += sun4m_irq.o sun4d_irq.o
 
 obj-y                   += process_$(BITS).o
+obj-y                   += process.o
 obj-y                   += signal_$(BITS).o
 obj-y                   += sigutil_$(BITS).o
 obj-$(CONFIG_SPARC32)   += ioport.o
diff --git a/arch/sparc/kernel/entry.S b/arch/sparc/kernel/entry.S
index f636acf..d589402 100644
--- a/arch/sparc/kernel/entry.S
+++ b/arch/sparc/kernel/entry.S
@@ -869,14 +869,11 @@
 	ld	[%curptr + TI_TASK], %o4
 	rd	%psr, %g4
 	WRITE_PAUSE
-	mov	SIGCHLD, %o0			! arg0:	clone flags
 	rd	%wim, %g5
 	WRITE_PAUSE
-	mov	%fp, %o1			! arg1:	usp
 	std	%g4, [%o4 + AOFF_task_thread + AOFF_thread_fork_kpsr]
-	add	%sp, STACKFRAME_SZ, %o2		! arg2:	pt_regs ptr
-	mov	0, %o3
-	call	sparc_do_fork
+	add	%sp, STACKFRAME_SZ, %o0
+	call	sparc_fork
 	 mov	%l5, %o7
 
 	/* Whee, kernel threads! */
@@ -888,19 +885,11 @@
 	ld	[%curptr + TI_TASK], %o4
 	rd	%psr, %g4
 	WRITE_PAUSE
-
-	/* arg0,1: flags,usp  -- loaded already */
-	cmp	%o1, 0x0			! Is new_usp NULL?
 	rd	%wim, %g5
 	WRITE_PAUSE
-	be,a	1f
-	 mov	%fp, %o1			! yes, use callers usp
-	andn	%o1, 7, %o1			! no, align to 8 bytes
-1:
 	std	%g4, [%o4 + AOFF_task_thread + AOFF_thread_fork_kpsr]
-	add	%sp, STACKFRAME_SZ, %o2		! arg2:	pt_regs ptr
-	mov	0, %o3
-	call	sparc_do_fork
+	add	%sp, STACKFRAME_SZ, %o0
+	call	sparc_clone
 	 mov	%l5, %o7
 
 	/* Whee, real vfork! */
@@ -914,13 +903,9 @@
 	rd	%wim, %g5
 	WRITE_PAUSE
 	std	%g4, [%o4 + AOFF_task_thread + AOFF_thread_fork_kpsr]
-	sethi	%hi(0x4000 | 0x0100 | SIGCHLD), %o0
-	mov	%fp, %o1
-	or	%o0, %lo(0x4000 | 0x0100 | SIGCHLD), %o0
-	sethi	%hi(sparc_do_fork), %l1
-	mov	0, %o3
-	jmpl	%l1 + %lo(sparc_do_fork), %g0
-	 add	%sp, STACKFRAME_SZ, %o2
+	sethi	%hi(sparc_vfork), %l1
+	jmpl	%l1 + %lo(sparc_vfork), %g0
+	 add	%sp, STACKFRAME_SZ, %o0
 
         .align  4
 linux_sparc_ni_syscall:
diff --git a/arch/sparc/kernel/kernel.h b/arch/sparc/kernel/kernel.h
index f6f498b..9cd09a3 100644
--- a/arch/sparc/kernel/kernel.h
+++ b/arch/sparc/kernel/kernel.h
@@ -14,6 +14,11 @@ extern const char *sparc_pmu_type;
 extern unsigned int fsr_storage;
 extern int ncpus_probed;
 
+/* process{_32,_64}.c */
+asmlinkage long sparc_clone(struct pt_regs *regs);
+asmlinkage long sparc_fork(struct pt_regs *regs);
+asmlinkage long sparc_vfork(struct pt_regs *regs);
+
 #ifdef CONFIG_SPARC64
 /* setup_64.c */
 struct seq_file;
@@ -153,12 +158,6 @@ void floppy_hardint(void);
 extern unsigned long sun4m_cpu_startup;
 extern unsigned long sun4d_cpu_startup;
 
-/* process_32.c */
-asmlinkage int sparc_do_fork(unsigned long clone_flags,
-                             unsigned long stack_start,
-                             struct pt_regs *regs,
-                             unsigned long stack_size);
-
 /* signal_32.c */
 asmlinkage void do_sigreturn(struct pt_regs *regs);
 asmlinkage void do_rt_sigreturn(struct pt_regs *regs);
diff --git a/arch/sparc/kernel/process.c b/arch/sparc/kernel/process.c
new file mode 100644
index 0000000..5234b5c
--- /dev/null
+++ b/arch/sparc/kernel/process.c
@@ -0,0 +1,110 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * This file handles the architecture independent parts of process handling.
+ */
+
+#include <linux/compat.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/ptrace.h>
+#include <linux/sched.h>
+#include <linux/sched/task.h>
+#include <linux/sched/task_stack.h>
+#include <linux/signal.h>
+
+#include "kernel.h"
+
+asmlinkage long sparc_fork(struct pt_regs *regs)
+{
+	unsigned long orig_i1 = regs->u_regs[UREG_I1];
+	long ret;
+	struct kernel_clone_args args = {
+		.exit_signal	= SIGCHLD,
+		/* Reuse the parent's stack for the child. */
+		.stack		= regs->u_regs[UREG_FP],
+	};
+
+	ret = _do_fork(&args);
+
+	/* If we get an error and potentially restart the system
+	 * call, we're screwed because copy_thread() clobbered
+	 * the parent's %o1.  So detect that case and restore it
+	 * here.
+	 */
+	if ((unsigned long)ret >= -ERESTART_RESTARTBLOCK)
+		regs->u_regs[UREG_I1] = orig_i1;
+
+	return ret;
+}
+
+asmlinkage long sparc_vfork(struct pt_regs *regs)
+{
+	unsigned long orig_i1 = regs->u_regs[UREG_I1];
+	long ret;
+
+	struct kernel_clone_args args = {
+		.flags		= CLONE_VFORK | CLONE_VM,
+		.exit_signal	= SIGCHLD,
+		/* Reuse the parent's stack for the child. */
+		.stack		= regs->u_regs[UREG_FP],
+	};
+
+	ret = _do_fork(&args);
+
+	/* If we get an error and potentially restart the system
+	 * call, we're screwed because copy_thread() clobbered
+	 * the parent's %o1.  So detect that case and restore it
+	 * here.
+	 */
+	if ((unsigned long)ret >= -ERESTART_RESTARTBLOCK)
+		regs->u_regs[UREG_I1] = orig_i1;
+
+	return ret;
+}
+
+asmlinkage long sparc_clone(struct pt_regs *regs)
+{
+	unsigned long orig_i1 = regs->u_regs[UREG_I1];
+	unsigned int flags = lower_32_bits(regs->u_regs[UREG_I0]);
+	long ret;
+
+	struct kernel_clone_args args = {
+		.flags		= (flags & ~CSIGNAL),
+		.exit_signal	= (flags & CSIGNAL),
+		.tls		= regs->u_regs[UREG_I3],
+	};
+
+#ifdef CONFIG_COMPAT
+	if (test_thread_flag(TIF_32BIT)) {
+		args.pidfd	= compat_ptr(regs->u_regs[UREG_I2]);
+		args.child_tid	= compat_ptr(regs->u_regs[UREG_I4]);
+		args.parent_tid	= compat_ptr(regs->u_regs[UREG_I2]);
+	} else
+#endif
+	{
+		args.pidfd	= (int __user *)regs->u_regs[UREG_I2];
+		args.child_tid	= (int __user *)regs->u_regs[UREG_I4];
+		args.parent_tid	= (int __user *)regs->u_regs[UREG_I2];
+	}
+
+	/* Did userspace set up a separate stack for the child or are we
+	 * reusing the parent's?
+	 */
+	if (regs->u_regs[UREG_I1])
+		args.stack = regs->u_regs[UREG_I1];
+	else
+		args.stack = regs->u_regs[UREG_FP];
+
+	ret = _do_fork(&args);
+
+	/* If we get an error and potentially restart the system
+	 * call, we're screwed because copy_thread() clobbered
+	 * the parent's %o1.  So detect that case and restore it
+	 * here.
+	 */
+	if ((unsigned long)ret >= -ERESTART_RESTARTBLOCK)
+		regs->u_regs[UREG_I1] = orig_i1;
+
+	return ret;
+}
diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c
index 13cb563..bd123f1 100644
--- a/arch/sparc/kernel/process_32.c
+++ b/arch/sparc/kernel/process_32.c
@@ -257,33 +257,6 @@ clone_stackframe(struct sparc_stackf __user *dst,
 	return sp;
 }
 
-asmlinkage int sparc_do_fork(unsigned long clone_flags,
-                             unsigned long stack_start,
-                             struct pt_regs *regs,
-                             unsigned long stack_size)
-{
-	unsigned long parent_tid_ptr, child_tid_ptr;
-	unsigned long orig_i1 = regs->u_regs[UREG_I1];
-	long ret;
-
-	parent_tid_ptr = regs->u_regs[UREG_I2];
-	child_tid_ptr = regs->u_regs[UREG_I4];
-
-	ret = do_fork(clone_flags, stack_start, stack_size,
-		      (int __user *) parent_tid_ptr,
-		      (int __user *) child_tid_ptr);
-
-	/* If we get an error and potentially restart the system
-	 * call, we're screwed because copy_thread() clobbered
-	 * the parent's %o1.  So detect that case and restore it
-	 * here.
-	 */
-	if ((unsigned long)ret >= -ERESTART_RESTARTBLOCK)
-		regs->u_regs[UREG_I1] = orig_i1;
-
-	return ret;
-}
-
 /* Copy a Sparc thread.  The fork() return value conventions
  * under SunOS are nothing short of bletcherous:
  * Parent -->  %o0 == childs  pid, %o1 == 0
@@ -300,8 +273,8 @@ asmlinkage int sparc_do_fork(unsigned long clone_flags,
 extern void ret_from_fork(void);
 extern void ret_from_kernel_thread(void);
 
-int copy_thread(unsigned long clone_flags, unsigned long sp,
-		unsigned long arg, struct task_struct *p)
+int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
+		struct task_struct *p, unsigned long tls)
 {
 	struct thread_info *ti = task_thread_info(p);
 	struct pt_regs *childregs, *regs = current_pt_regs();
@@ -403,7 +376,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
 	regs->u_regs[UREG_I1] = 0;
 
 	if (clone_flags & CLONE_SETTLS)
-		childregs->u_regs[UREG_G7] = regs->u_regs[UREG_I3];
+		childregs->u_regs[UREG_G7] = tls;
 
 	return 0;
 }
diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index 54945ea..04ef19b 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -572,47 +572,13 @@ void fault_in_user_windows(struct pt_regs *regs)
 	force_sig(SIGSEGV);
 }
 
-asmlinkage long sparc_do_fork(unsigned long clone_flags,
-			      unsigned long stack_start,
-			      struct pt_regs *regs,
-			      unsigned long stack_size)
-{
-	int __user *parent_tid_ptr, *child_tid_ptr;
-	unsigned long orig_i1 = regs->u_regs[UREG_I1];
-	long ret;
-
-#ifdef CONFIG_COMPAT
-	if (test_thread_flag(TIF_32BIT)) {
-		parent_tid_ptr = compat_ptr(regs->u_regs[UREG_I2]);
-		child_tid_ptr = compat_ptr(regs->u_regs[UREG_I4]);
-	} else
-#endif
-	{
-		parent_tid_ptr = (int __user *) regs->u_regs[UREG_I2];
-		child_tid_ptr = (int __user *) regs->u_regs[UREG_I4];
-	}
-
-	ret = do_fork(clone_flags, stack_start, stack_size,
-		      parent_tid_ptr, child_tid_ptr);
-
-	/* If we get an error and potentially restart the system
-	 * call, we're screwed because copy_thread() clobbered
-	 * the parent's %o1.  So detect that case and restore it
-	 * here.
-	 */
-	if ((unsigned long)ret >= -ERESTART_RESTARTBLOCK)
-		regs->u_regs[UREG_I1] = orig_i1;
-
-	return ret;
-}
-
 /* Copy a Sparc thread.  The fork() return value conventions
  * under SunOS are nothing short of bletcherous:
  * Parent -->  %o0 == childs  pid, %o1 == 0
  * Child  -->  %o0 == parents pid, %o1 == 1
  */
-int copy_thread(unsigned long clone_flags, unsigned long sp,
-		unsigned long arg, struct task_struct *p)
+int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
+		struct task_struct *p, unsigned long tls)
 {
 	struct thread_info *t = task_thread_info(p);
 	struct pt_regs *regs = current_pt_regs();
@@ -670,7 +636,7 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
 	regs->u_regs[UREG_I1] = 0;
 
 	if (clone_flags & CLONE_SETTLS)
-		t->kregs->u_regs[UREG_G7] = regs->u_regs[UREG_I3];
+		t->kregs->u_regs[UREG_G7] = tls;
 
 	return 0;
 }
diff --git a/arch/sparc/kernel/syscalls.S b/arch/sparc/kernel/syscalls.S
index db42b4f..0e8ab06 100644
--- a/arch/sparc/kernel/syscalls.S
+++ b/arch/sparc/kernel/syscalls.S
@@ -86,19 +86,22 @@
 	 * during system calls...
 	 */
 	.align	32
-sys_vfork: /* Under Linux, vfork and fork are just special cases of clone. */
-	sethi	%hi(0x4000 | 0x0100 | SIGCHLD), %o0
-	or	%o0, %lo(0x4000 | 0x0100 | SIGCHLD), %o0
-	ba,pt	%xcc, sys_clone
+sys_vfork:
+	flushw
+	ba,pt	%xcc, sparc_vfork
+	 add	%sp, PTREGS_OFF, %o0
+
+	.align	32
 sys_fork:
-	 clr	%o1
-	mov	SIGCHLD, %o0
+	flushw
+	ba,pt	%xcc, sparc_fork
+	 add	%sp, PTREGS_OFF, %o0
+
+	.align	32
 sys_clone:
 	flushw
-	movrz	%o1, %fp, %o1
-	mov	0, %o3
-	ba,pt	%xcc, sparc_do_fork
-	 add	%sp, PTREGS_OFF, %o2
+	ba,pt	%xcc, sparc_clone
+	 add	%sp, PTREGS_OFF, %o0
 
 	.globl	ret_from_fork
 ret_from_fork:
diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index 9318dc6..ef69be1 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -14,7 +14,6 @@
 	select HAVE_FUTEX_CMPXCHG if FUTEX
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DEBUG_BUGVERBOSE
-	select HAVE_COPY_THREAD_TLS
 	select GENERIC_IRQ_SHOW
 	select GENERIC_CPU_DEVICES
 	select GENERIC_CLOCKEVENTS
diff --git a/arch/um/kernel/process.c b/arch/um/kernel/process.c
index e3a2cf9..26b5e243 100644
--- a/arch/um/kernel/process.c
+++ b/arch/um/kernel/process.c
@@ -152,7 +152,7 @@ void fork_handler(void)
 	userspace(&current->thread.regs.regs, current_thread_info()->aux_fp_regs);
 }
 
-int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
+int copy_thread(unsigned long clone_flags, unsigned long sp,
 		unsigned long arg, struct task_struct * p, unsigned long tls)
 {
 	void (*handler)(void);
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 812baf2..addb27d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -161,7 +161,6 @@
 	select HAVE_CMPXCHG_DOUBLE
 	select HAVE_CMPXCHG_LOCAL
 	select HAVE_CONTEXT_TRACKING		if X86_64
-	select HAVE_COPY_THREAD_TLS
 	select HAVE_C_RECORDMCOUNT
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_CONTIGUOUS
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index fe67dbd..4298634 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -121,8 +121,8 @@ static int set_new_tls(struct task_struct *p, unsigned long tls)
 		return do_set_thread_area_64(p, ARCH_SET_FS, tls);
 }
 
-int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
-		    unsigned long arg, struct task_struct *p, unsigned long tls)
+int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
+		struct task_struct *p, unsigned long tls)
 {
 	struct inactive_task_frame *frame;
 	struct fork_frame *fork_frame;
diff --git a/arch/x86/kernel/sys_ia32.c b/arch/x86/kernel/sys_ia32.c
index f8d65c9..720cde8 100644
--- a/arch/x86/kernel/sys_ia32.c
+++ b/arch/x86/kernel/sys_ia32.c
@@ -251,9 +251,6 @@ COMPAT_SYSCALL_DEFINE5(ia32_clone, unsigned long, clone_flags,
 		.tls		= tls_val,
 	};
 
-	if (!legacy_clone_args_valid(&args))
-		return -EINVAL;
-
 	return _do_fork(&args);
 }
 #endif /* CONFIG_IA32_EMULATION */
diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c
index e40b494..d7c44b2 100644
--- a/arch/x86/kernel/unwind_frame.c
+++ b/arch/x86/kernel/unwind_frame.c
@@ -269,7 +269,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		/*
 		 * kthreads (other than the boot CPU's idle thread) have some
 		 * partial regs at the end of their stack which were placed
-		 * there by copy_thread_tls().  But the regs don't have any
+		 * there by copy_thread().  But the regs don't have any
 		 * useful information, so we can skip them.
 		 *
 		 * This user_mode() check is slightly broader than a PF_KTHREAD
diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index 3a9f1e8..b71ba91 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -24,7 +24,6 @@
 	select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL
 	select HAVE_ARCH_KASAN if MMU && !XIP_KERNEL
 	select HAVE_ARCH_TRACEHOOK
-	select HAVE_COPY_THREAD_TLS
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_EXIT_THREAD
diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c
index b7fe6f4..397a7de 100644
--- a/arch/xtensa/kernel/process.c
+++ b/arch/xtensa/kernel/process.c
@@ -201,7 +201,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
  * involved.  Much simpler to just not copy those live frames across.
  */
 
-int copy_thread_tls(unsigned long clone_flags, unsigned long usp_thread_fn,
+int copy_thread(unsigned long clone_flags, unsigned long usp_thread_fn,
 		unsigned long thread_fn_arg, struct task_struct *p,
 		unsigned long tls)
 {
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 27b4fa4..ae3060f 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -66,22 +66,9 @@ extern void fork_init(void);
 
 extern void release_task(struct task_struct * p);
 
-#ifdef CONFIG_HAVE_COPY_THREAD_TLS
-extern int copy_thread_tls(unsigned long, unsigned long, unsigned long,
-			struct task_struct *, unsigned long);
-#else
 extern int copy_thread(unsigned long, unsigned long, unsigned long,
-			struct task_struct *);
+		       struct task_struct *, unsigned long);
 
-/* Architectures that haven't opted into copy_thread_tls get the tls argument
- * via pt_regs, so ignore the tls argument passed via C. */
-static inline int copy_thread_tls(
-		unsigned long clone_flags, unsigned long sp, unsigned long arg,
-		struct task_struct *p, unsigned long tls)
-{
-	return copy_thread(clone_flags, sp, arg, p);
-}
-#endif
 extern void flush_thread(void);
 
 #ifdef CONFIG_HAVE_EXIT_THREAD
@@ -97,8 +84,6 @@ extern void exit_files(struct task_struct *);
 extern void exit_itimers(struct signal_struct *);
 
 extern long _do_fork(struct kernel_clone_args *kargs);
-extern bool legacy_clone_args_valid(const struct kernel_clone_args *kargs);
-extern long do_fork(unsigned long, unsigned long, unsigned long, int __user *, int __user *);
 struct task_struct *fork_idle(int);
 struct mm_struct *copy_init_mm(void);
 extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
diff --git a/kernel/fork.c b/kernel/fork.c
index 40996d2..b7da900 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2097,8 +2097,7 @@ static __latent_entropy struct task_struct *copy_process(
 	retval = copy_io(clone_flags, p);
 	if (retval)
 		goto bad_fork_cleanup_namespaces;
-	retval = copy_thread_tls(clone_flags, args->stack, args->stack_size, p,
-				 args->tls);
+	retval = copy_thread(clone_flags, args->stack, args->stack_size, p, args->tls);
 	if (retval)
 		goto bad_fork_cleanup_io;
 
@@ -2417,6 +2416,20 @@ long _do_fork(struct kernel_clone_args *args)
 	long nr;
 
 	/*
+	 * For legacy clone() calls, CLONE_PIDFD uses the parent_tid argument
+	 * to return the pidfd. Hence, CLONE_PIDFD and CLONE_PARENT_SETTID are
+	 * mutually exclusive. With clone3() CLONE_PIDFD has grown a separate
+	 * field in struct clone_args and it still doesn't make sense to have
+	 * them both point at the same memory location. Performing this check
+	 * here has the advantage that we don't need to have a separate helper
+	 * to check for legacy clone().
+	 */
+	if ((args->flags & CLONE_PIDFD) &&
+	    (args->flags & CLONE_PARENT_SETTID) &&
+	    (args->pidfd == args->parent_tid))
+		return -EINVAL;
+
+	/*
 	 * Determine whether and which event to report to ptracer.  When
 	 * called from kernel_thread or CLONE_UNTRACED is explicitly
 	 * requested, no event is reported; otherwise, report if the event
@@ -2473,42 +2486,6 @@ long _do_fork(struct kernel_clone_args *args)
 	return nr;
 }
 
-bool legacy_clone_args_valid(const struct kernel_clone_args *kargs)
-{
-	/* clone(CLONE_PIDFD) uses parent_tidptr to return a pidfd */
-	if ((kargs->flags & CLONE_PIDFD) &&
-	    (kargs->flags & CLONE_PARENT_SETTID))
-		return false;
-
-	return true;
-}
-
-#ifndef CONFIG_HAVE_COPY_THREAD_TLS
-/* For compatibility with architectures that call do_fork directly rather than
- * using the syscall entry points below. */
-long do_fork(unsigned long clone_flags,
-	      unsigned long stack_start,
-	      unsigned long stack_size,
-	      int __user *parent_tidptr,
-	      int __user *child_tidptr)
-{
-	struct kernel_clone_args args = {
-		.flags		= (lower_32_bits(clone_flags) & ~CSIGNAL),
-		.pidfd		= parent_tidptr,
-		.child_tid	= child_tidptr,
-		.parent_tid	= parent_tidptr,
-		.exit_signal	= (lower_32_bits(clone_flags) & CSIGNAL),
-		.stack		= stack_start,
-		.stack_size	= stack_size,
-	};
-
-	if (!legacy_clone_args_valid(&args))
-		return -EINVAL;
-
-	return _do_fork(&args);
-}
-#endif
-
 /*
  * Create a kernel thread.
  */
@@ -2587,24 +2564,12 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
 		.tls		= tls,
 	};
 
-	if (!legacy_clone_args_valid(&args))
-		return -EINVAL;
-
 	return _do_fork(&args);
 }
 #endif
 
 #ifdef __ARCH_WANT_SYS_CLONE3
 
-/*
- * copy_thread implementations handle CLONE_SETTLS by reading the TLS value from
- * the registers containing the syscall arguments for clone. This doesn't work
- * with clone3 since the TLS value is passed in clone_args instead.
- */
-#ifndef CONFIG_HAVE_COPY_THREAD_TLS
-#error clone3 requires copy_thread_tls support in arch
-#endif
-
 noinline static int copy_clone_args_from_user(struct kernel_clone_args *kargs,
 					      struct clone_args __user *uargs,
 					      size_t usize)
@@ -2919,7 +2884,7 @@ static int unshare_fd(unsigned long unshare_flags, struct files_struct **new_fdp
 /*
  * unshare allows a process to 'unshare' part of the process
  * context which was originally shared using clone.  copy_*
- * functions used by do_fork() cannot be used here directly
+ * functions used by _do_fork() cannot be used here directly
  * because they modify an inactive task_struct that is being
  * constructed. Here we are modifying the current, active,
  * task_struct.